[FR] Formal Support for Splitting BN and VC into Separate Processes #3088
Comments
Another reason for separating the Beacon Node from the Validator Client is attack surface reduction. If separated from the BC, the VC could run isolated, without any ports open to the Internet. The BC is considerably more complex than the VC, and therefore has a larger attack surface. By coupling the VC with the BC, we may be unnecessarily exposing the VC to remote attacks and increasing the risk of leaking key material.
While this is often mentioned, the complexity of running several processes generally leads to a worse overall security outcome than a simpler setup with fewer moving parts, because the biggest threat to security is often the human factor, far more so than any technical factor. If we define "security" in this context as "preventing the loss of funds", this view has a lot more backing in the real world: 100% of the slashings with a known cause so far have been linked to some form of user error related to overly complex validator client setups.
I agree with your argument: human error is one of the main causes (maybe the main cause?) of security issues, and complexity makes it more difficult to secure infrastructure in general.
That being said, I also include programmers on the human side of the equation, and programmers do make errors. After all, many security vulnerabilities are introduced into the code by humans :) So, by splitting the VC from the BC we could in fact shield users from human (developer) error. We could argue that an implementation error in the code can have far more serious consequences, because it is amplified across thousands of vulnerable instances rather than a couple of misconfigured ones. Isolating code with different privilege levels can help mitigate this.
We could also argue that Ethereum's choice of having two separate keys (validation key and withdrawal key) leads to more complexity, and indeed it does. People might misplace keys, confuse their purposes, etc. But it is also self-evident that this design reduces the incentive to attack validators, and results in overall improved security.
Is there any documented real-world instance where having the VC separate from the BC has resulted in decreased security?
P.S. - I know I won't convince you with my arguments, but I also think that the claim that "everything running in the same process" leads to more secure software is not a consensus view. There are many counter-examples of secure software that has adopted the "split components into different privilege levels in order to contain damage" approach:
A good example is https://blog.staked.us/blog/eth2-post-mortem - this is a professional setup run by competent people, and yet it failed because there was too much complexity on the VC side without additional safety nets - it resulted in the loss of 75 validators. Again, this depends somewhat on your definition of "security": if you constrain it to "a hacker gained access through technical means over the internet and did bad things", then a VC setup will naturally be more secure - if instead you mean it as "the architecture helped keep the keys online", then all slashings so far are unlikely to have happened had their owners pursued simpler setups.
Oh, I don't need convincing really - I'm fully aware that under the right conditions, there is a chance that you'll be able to create a more secure setup with a split architecture - this architecture is being implemented in Nimbus as well, as an option for our users - in particular, you can already use Nimbus as a beacon node for alternative VC implementations. I'm merely pointing out that "keeping more validator keys online" is not one of the probable / predictable outcomes of the feature, quite the contrary: I'm quite certain more keys will be lost because of it. You're right though that we shouldn't trust the programmers either - we have multiple safety nets in place around keys, code review, auditing and so on that also act as filters for human errors. As an example, the VC architecture itself is quite complex: the VC does lots of things beyond pure signatures, and will be doing even more "soon" with the merge - if securing your keys is your goal, you're better off with our features for out-of-process signing, which run a trivial "signing process" on the side that does nothing other than sign things - it's intended for hardware wallet signing and protocols like web3signer, and is much more appropriate for keeping a small security surface area around private keys.
Another situation where splitting the beacon and validator processes into separate services would help is when using Nimbus with web3signer. You may want to switch the client you validate with at any given moment by stopping the validator service while the beacon node keeps syncing. You should be able to stop validating with Nimbus without stopping the beacon node from syncing.
This has been implemented as of v22.11!
One of the useful features of the other clients is the separation of the Beacon Node and the Validator Client into separate processes. This is quite beneficial for a few reasons:
- It means users can preserve their validator key setups and keep a dedicated VC, but are able to move the BN around if necessary. For example, on low-power systems, sometimes (such as during a Sync Committee) the BN could become overwhelmed. It may be preferable to temporarily offload the BN to another machine with more resources available, and return to the original machine afterwards.
- It would allow for a single BN to connect to multiple VCs. Rocket Pool's "Hybrid Mode" allows a user to use an externally managed BN (say, for solo staking) and connect a separate VC that Rocket Pool manages to it. This mode is not compatible with Nimbus, meaning Rocket Pool must manage the entire stack, which prevents solo stakers from leveraging it.
- It allows users to experiment with different BN implementations. My understanding is that the BN and VC communicate via the standard Beacon REST API, and thus should be interchangeable. This would allow, for example, a Nimbus BN to attach to a Lighthouse VC (or vice versa); users could sync Nimbus in the background, and when it's ready, point their VC at it instead. This would help improve client diversity by providing an easy way to experiment with the different BNs without the risk of slashing.
- It would allow for BN failover to be an option; users could keep a VC running and attach multiple BN endpoints to it so that attestations are not lost in case one of the BNs goes down.
- When creating a new validator key, currently we have to restart the Nimbus client for it to pick up the new validator. This spin-up can take a long time. Splitting the clients would only require the VC to be restarted, which is comparatively a much faster task.
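To make the interchangeability and failover points concrete, here is a minimal Python sketch of how a VC-side helper might pick a usable BN from several endpoints. It relies only on the standard Beacon API route `/eth/v1/node/syncing` and its `is_syncing` field; the function names, the selection policy (first reachable node that is not syncing) and the URLs in the usage note are assumptions for illustration, not how Nimbus or any other client actually implements failover.

```python
import json
import urllib.request
from typing import Callable, Iterable, Optional

# Standard Beacon API route that reports a node's sync status.
SYNCING_PATH = "/eth/v1/node/syncing"


def fetch_syncing(base_url: str, timeout: float = 2.0) -> dict:
    """Query one beacon node and return the `data` object of its syncing response."""
    url = base_url.rstrip("/") + SYNCING_PATH
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["data"]


def pick_beacon_node(
    endpoints: Iterable[str],
    fetch: Callable[[str], dict] = fetch_syncing,
) -> Optional[str]:
    """Return the first endpoint that is reachable and not syncing, else None.

    Hypothetical policy: endpoints are tried in the order given, so the
    caller expresses preference by ordering the list.
    """
    for url in endpoints:
        try:
            status = fetch(url)
        except (OSError, ValueError, KeyError):
            continue  # unreachable node or malformed response: try the next one
        # Treat a missing field as "still syncing" to stay on the safe side.
        if not status.get("is_syncing", True):
            return url
    return None
```

A caller would pass its preferred node first, for example `pick_beacon_node(["http://localhost:5052", "http://backup-host:5052"])`, where both URLs are placeholders. Because the check uses only the standard API, the same helper should work regardless of which client implements each BN.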
I have experimented with this split-process mode on Prater, and while it worked well enough for the testnet, I was advised that it is not considered production-ready yet. I encourage the team to look into finishing support for it because we would most certainly leverage it.