-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add block and actor caches #766
Conversation
1bf1a80
to
4ad1a0a
Compare
Codecov Report
@@ Coverage Diff @@
## master #766 +/- ##
======================================
Coverage 32.8% 32.8%
======================================
Files 40 40
Lines 3704 3704
======================================
Hits 1215 1215
Misses 2353 2353
Partials 136 136 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff!
I'd like to give this a second look after some of the comments are discussed/addressed.
main.go
Outdated
&cli.IntFlag{ | ||
Name: "blockstore-cache-size", | ||
EnvVars: []string{"LILY_BLOCKSTORE_CACHE_SIZE"}, | ||
Value: 0, | ||
Destination: &commands.CacheFlags.BlockstoreCacheSize, | ||
}, | ||
&cli.IntFlag{ | ||
Name: "statestore-cache-size", | ||
EnvVars: []string{"LILY_STATESTORE_CACHE_SIZE"}, | ||
Value: 0, | ||
Destination: &commands.CacheFlags.StatestoreCacheSize, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to have these here instead of the daemon command? I'd prefer these to live on the daemon since they don't have an effect on other commands when provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also make these Uint flags since negative values are disallowed -- error cases creating the caches could be safely ignored as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, could be on daemon. I can move them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like using uint in general because Go has no overflow detection. They're best reserved for bitfields or masks. But I don't feel that strongly in this case. We use int64 for heights without a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc we use signed ints for height because that's the underlying type of ChainEpoch
and specifying a negative epoch allows someone to specify a time before the genesis block. Although I don't think that's used anywhere, yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to uint on the flags and dependency injection structs
storage/sql_test.go
Outdated
@@ -18,7 +18,7 @@ import ( | |||
"github.com/filecoin-project/lily/testutil" | |||
) | |||
|
|||
const defaultDatabaseWaitTime = time.Minute * 1 | |||
const defaultDatabaseWaitTime = time.Minute * 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
commands/lily.go
Outdated
@@ -39,7 +39,7 @@ func NewSentinelNodeRPC(ctx context.Context, addr string, requestHeader http.Hea | |||
} | |||
|
|||
// Lily Node settings for injection into lotus node. | |||
func LilyNodeAPIOption(out *lily.LilyAPI, fopts ...node.Option) node.Option { | |||
func LilyNodeAPIOption(out *lily.LilyAPI, statestoreCacheSize int, fopts ...node.Option) node.Option { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remnant from a previous implementation? Looks like CacheConfig replaced this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I will remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
lens/lily/modules/store.go
Outdated
}) | ||
} | ||
|
||
if cacheSize <= 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This forces users to pass the blockstore-cache-size
flag since its default value is 0
- not the best UX. Can we either pick a nice default for the cache size or omit the check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored to remove this
log.Infof("not creating caching statestore (size=%d)", m.CacheConfig.StatestoreCacheSize) | ||
} | ||
|
||
m.actorStore = m.ChainAPI.Chain.ActorStore(context.TODO()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be within the above else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. it creates a non-caching actor store as a fallback from 2 scenarios: there was an error returned from util.NewCachingStateStore
or m.CacheConfig.StatestoreCacheSize<=0
(if no error returned from util.NewCachingStateStore
then we have already returned from the function)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah whoops, I missed the return in the inner-most else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored to make clearer what the flow is here
} | ||
|
||
func NewCachingStateStore(blocks blockstore.Blockstore, cacheSize int) (*CachingStateStore, error) { | ||
cache, err := lru.NewARC(cacheSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc this only errors when a negative value is passed, could panic here instead and simplify some of the calling code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's good UX to panic on user input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, and enforcement of positive integers could be done with Uint
flags, but I'm alright with this as is.
} | ||
|
||
func NewCachingBlockstore(blocks blockstore.Blockstore, cacheSize int) (*CachingBlockstore, error) { | ||
cache, err := lru.NewARC(cacheSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto comment wrt the error check here.
b14648d
to
cdc84e0
Compare
Adds a block cache that sits just above badger and a state cache in the Lily API lens. The nlock cache holds byte slices keyed by cid, whereas the state cache holds decoded cbor keyed by cid. Both can reduce i/o use by Lily which is a major performance bottleneck. The state cache can get up to 90% hit rate during a walk indicating that there are many repeated reads of actor state. Without the state cache the block cache can also reach 90%+ hit rate, but when both are used together it generally falls to about 50%. i/o utilization falls by about 90% in both cases.
Cache sizes are configured by two new flags when starting lily or by setting the
LILY_BLOCKSTORE_CACHE_SIZE
andLILY_STATESTORE_CACHE_SIZE
environment variables.This PR also contains an unrelated change: increase the database timeout used for running tests.