Proposal to avoid memory copies between jacobsa FUSE library and application code #143

Open
sbauersfeld opened this issue May 11, 2023 · 4 comments


@sbauersfeld
Contributor

sbauersfeld commented May 11, 2023

Hello,

I work with a team of engineers and we are using a FUSE filesystem built on the jacobsa library. Recently, we have been looking into performance improvements for reading and writing large files, and we found that copying data into the ReadFile Dst buffer and out of the WriteFile Data buffer is reducing our performance. We created a prototype that removes these copies and saw throughput improvements of 100-200 MB/s, up to ~500 MB/s -> ~700 MB/s for reads and ~400 MB/s -> ~600 MB/s for writes.

The "vectored read" improvement added in this PR is an important first step for skipping the data copy in ReadFile, but unfortunately it does not seem to fully satisfy our use case, because we don't have any way of knowing when the data has been successfully written back to the kernel (and thus, we don't know when we can free up the data we are holding in memory).

For WriteFile, we have a similar problem, because we want to return a response to the kernel as soon as possible, but we may not yet be ready to free up the InMessage containing the data for the WriteFile request.

I am open to suggestions here, but our proposal to satisfy our use cases for both ReadFile and WriteFile is to add an interface that would essentially wrap the Freelists that are used to get and free up InMessage and OutMessage.

For example, we would add an interface that looks something like:

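type MessageProvider interface {
  // Get a message buffer for the given size (the int type is an assumption).
  GetInMessage(size int) *InMessage
  GetOutMessage(size int) *OutMessage

  // Hand a message back once the library no longer needs it.
  PutInMessage(msg *InMessage)
  PutOutMessage(msg *OutMessage)
}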

The mount configuration could allow a custom implementation of this interface to be provided, so that our application code can both get notified when an In/OutMessage is no longer needed (because the response has been written to the kernel) and control when the In/OutMessage is actually freed up for future reuse.

If no custom implementation is provided in the mount configuration, then a default implementation of this interface would be a lightweight wrapper around the Freelists that exist today.
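
To make the idea more concrete, here is a minimal, self-contained sketch of how the two implementations might fit together. Everything in it is hypothetical: `InMessage` and `OutMessage` are simplified stand-ins for the library's buffer types, `sync.Pool` stands in for the existing Freelists, and `poolProvider`, `notifyingProvider`, and `onOutDone` are names made up for illustration, not part of the jacobsa library.

```go
package main

import (
	"fmt"
	"sync"
)

// Simplified stand-ins for the library's message types (assumption: the
// real types also carry kernel headers and other bookkeeping).
type InMessage struct{ Data []byte }
type OutMessage struct{ Data []byte }

// MessageProvider follows the interface proposed in this issue; the exact
// signatures are assumed.
type MessageProvider interface {
	GetInMessage(size int) *InMessage
	GetOutMessage(size int) *OutMessage
	PutInMessage(m *InMessage)
	PutOutMessage(m *OutMessage)
}

// poolProvider is a hypothetical default implementation: a thin wrapper
// around pools, standing in for the Freelists that exist today.
type poolProvider struct {
	in, out sync.Pool
}

func newPoolProvider() *poolProvider {
	return &poolProvider{
		in:  sync.Pool{New: func() any { return new(InMessage) }},
		out: sync.Pool{New: func() any { return new(OutMessage) }},
	}
}

// resize returns a slice of length size, reusing b's storage when possible.
func resize(b []byte, size int) []byte {
	if cap(b) >= size {
		return b[:size]
	}
	return make([]byte, size)
}

func (p *poolProvider) GetInMessage(size int) *InMessage {
	m := p.in.Get().(*InMessage)
	m.Data = resize(m.Data, size)
	return m
}

func (p *poolProvider) GetOutMessage(size int) *OutMessage {
	m := p.out.Get().(*OutMessage)
	m.Data = resize(m.Data, size)
	return m
}

func (p *poolProvider) PutInMessage(m *InMessage)   { p.in.Put(m) }
func (p *poolProvider) PutOutMessage(m *OutMessage) { p.out.Put(m) }

// notifyingProvider is a hypothetical application-side implementation. It
// delegates to another provider but runs a hook when an OutMessage is handed
// back, which is the point at which the reply has been written to the kernel.
type notifyingProvider struct {
	MessageProvider
	onOutDone func(*OutMessage)
}

func (p *notifyingProvider) PutOutMessage(m *OutMessage) {
	p.onOutDone(m) // the application may now release the data backing m
	p.MessageProvider.PutOutMessage(m)
}

func main() {
	released := 0
	var mp MessageProvider = &notifyingProvider{
		MessageProvider: newPoolProvider(),
		onOutDone:       func(*OutMessage) { released++ },
	}

	out := mp.GetOutMessage(4096)
	n := len(out.Data)
	// ... the library would write the reply to the kernel here ...
	mp.PutOutMessage(out)

	fmt.Println("reply bytes:", n, "hooks fired:", released)
}
```

A provider along these lines could also override `PutInMessage` and postpone the inner `Put` until the application has finished with the WriteFile data, which is the second half of the use case described above.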

I am very happy to create a pull request for this issue. Do you have any particular thoughts and/or concerns before I do so?

Thanks,
Scott

@stapelberg
Collaborator

Hey Scott! That all seems reasonable to me. I haven’t looked much at the freelists TBH (they predate my involvement with the package), but we can hash out details over the actual PR. No thoughts or concerns beforehand :)

@sbauersfeld
Contributor Author

Hi @stapelberg, I created a PR to address this issue: #144

Could you please take a look when you get a chance? Thanks!

@stapelberg
Collaborator

Thanks, I saw. I’m currently traveling for work, but I hope to get around to your PR next week.

@sbauersfeld
Contributor Author

Hi @stapelberg, sorry to bother you again, but I would really appreciate it if you could review the PR sometime this week!
