Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some provider subsystem performance issues #319

Merged
merged 10 commits into from
Apr 13, 2019
Merged

Conversation

Stebalien
Copy link
Member

@Stebalien Stebalien commented Apr 13, 2019

  • Don't load into the read cache on write.
  • Delete on expiration.
  • Delete expired records on load.
  • Avoid populating the cache when garbage collecting (this just destroys the cache and hurts performance). Just drop the cache.
  • Make the algorithm not n^2. I believe keys() returns the CID once per provider (and we then iterate over all providers once per provider).
  • Use a non thread-safe cache (don't need thread safety).

related: #316

@ghost ghost assigned Stebalien Apr 13, 2019
@ghost ghost added the status/in-progress In progress label Apr 13, 2019
@Stebalien Stebalien force-pushed the fix/provider-mayhem branch 2 times, most recently from 84b9fac to 3ab4992 Compare April 13, 2019 05:12
@Stebalien Stebalien marked this pull request as ready for review April 13, 2019 05:12
@Stebalien Stebalien force-pushed the fix/provider-mayhem branch from 3ab4992 to c095545 Compare April 13, 2019 06:20
@Stebalien
Copy link
Member Author

Stebalien commented Apr 13, 2019

Note: We should be doing GC in a separate thread but it's a little tricky to get this right without race conditions. We can save this for a follow-up PR.

@Stebalien Stebalien force-pushed the fix/provider-mayhem branch from c095545 to 7ce5021 Compare April 13, 2019 06:40
@Stebalien Stebalien requested a review from vyzo April 13, 2019 06:44
Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -30,7 +30,7 @@ var defaultCleanupInterval = time.Hour
type ProviderManager struct {
// all non channel fields are meant to be accessed only within
// the run method
providers *lru.Cache
providers *lru.LRU
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the difference between these two?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess thread-safety.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this is safe? It could come back to bite us.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The old provider set wasn't safe to use from multiple threads anyways as we'd modify the *providerSet structs.

provs.setVal(p, now)
if provs, ok := pm.providers.Get(k); ok {
provs.(*providerSet).setVal(p, now)
} // else not cached, just write through
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

data, ok := i.([]byte)
if !ok {
return time.Time{}, fmt.Errorf("data was not a []byte")
func readTimeValue(data []byte) (time.Time, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoiding the interface{} is a nice little gain! No more allocation for passing byte slices.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this was left over from the datastore refactor when we replaced all the interfaces with []byte)

fallthrough
case now.Sub(t) > ProvideValidity:
// or just expired
err = pm.dstore.Delete(ds.RawKey(e.Key))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we also clean the lru cache here? Alternatively we can simply flush it completely on gc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, there is a Purge call before entering gc, good.

// drop them.
//
// Much faster than GCing.
pm.providers.Purge()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay for that!

@vyzo
Copy link
Contributor

vyzo commented Apr 13, 2019

Only concern here is removing the thread-safety from the LRU cache.

@Stebalien Stebalien force-pushed the fix/provider-mayhem branch from 6f26a7a to 7bdc7a5 Compare April 13, 2019 16:19
@Stebalien Stebalien merged commit 12c9510 into master Apr 13, 2019
@Stebalien Stebalien deleted the fix/provider-mayhem branch April 13, 2019 16:22
@ghost ghost removed the status/in-progress In progress label Apr 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants