Feature request: remote compute server (proving k2pow) #343
Hi @tjb-altf4, thanks for sharing the idea. I'm adding ideas for a possible high-level design below. Requirements:
The interface between the node and the post-service would remain unchanged. That is, the node requests a PoST proof generation from the post-service and polls for the result at intervals. The post-service creates the k2pow according to its configuration (either by calculating it locally or by requesting it from the external service). It then continues with PoST proving.
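The configuration-driven choice between local and remote k2pow could be sketched as a trait with two implementations. This is an illustrative sketch only: the names (`K2powSolver`, `LocalSolver`, `RemoteSolver`, `solver_from_config`) and the trivial stand-in computation are assumptions, not the actual post-rs API, and a real `RemoteSolver` would issue a gRPC request.

```rust
/// Hypothetical abstraction over where the k2pow gets computed.
pub trait K2powSolver {
    /// Returns the k2pow result for the given challenge, or an error string.
    fn solve(&self, challenge: &[u8]) -> Result<u64, String>;
}

/// Computes the k2pow on this machine.
pub struct LocalSolver;

impl K2powSolver for LocalSolver {
    fn solve(&self, challenge: &[u8]) -> Result<u64, String> {
        // Placeholder for the real RandomX nonce search; a trivial stand-in.
        Ok(challenge.iter().map(|&b| b as u64).sum())
    }
}

/// Delegates the k2pow to an external service.
pub struct RemoteSolver {
    pub endpoint: String, // e.g. "http://k2pow-server:50051" (hypothetical)
}

impl K2powSolver for RemoteSolver {
    fn solve(&self, _challenge: &[u8]) -> Result<u64, String> {
        // A real implementation would issue a gRPC request here.
        Err(format!("remote solver at {} not implemented", self.endpoint))
    }
}

/// Picks the solver from configuration; `None` means "compute locally".
pub fn solver_from_config(remote: Option<String>) -> Box<dyn K2powSolver> {
    match remote {
        Some(endpoint) => Box::new(RemoteSolver { endpoint }),
        None => Box::new(LocalSolver),
    }
}
```

Because the node only ever talks to the post-service, swapping `LocalSolver` for `RemoteSolver` behind this seam keeps the node/post-service interface unchanged.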
Thanks for responding @poszu, great to see requirements coming together already. I'd suggest considering whether the k2pow service should manage and maintain its own work queue. I believe this additional requirement would have the benefit of also resolving another Smesher UX improvements item (Simple orchestrator).
Yes, it totally makes sense to queue requests for calculating the k2pow as this is a heavily CPU-hungry task. I updated the requirements.
I'd probably vote, though, for no queue at all and a sensible error back. Why? It then becomes easier to put some "auto scaling" in front of the service itself (based purely on requests). The requirement of keeping the state locally, just in case, is still valid.
Hi there 👋 @poszu, I'm trying to figure out exactly which code needs to be pulled out to a separate service. Am I correct in understanding that the
Does this mean that we should assign a
Also, re:
Would it be possible to have a bit more specifics here? I'm not sure I fully understand this in practice. RandomX doesn't really give the ability to configure anything around the CPU AFAIU. How would you envision this optimal resource usage?
How would you know how much to scale the service? Also, if you assume that the service is load-balanced, all requests to execute anything must be blocking calls (otherwise, how would the caller know how to land the call back on the same node?), and it isn't clear how to do that. Maybe some more specifics here would help.
About scaling.
And then we gain two things: If we, however, do the queueing on the worker side, then we need to implement all the signaling for a full queue, etc. IMO an unnecessary complication. So to sum up:
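The "queue on the caller side" idea above can be sketched as a retry loop in the post-service: it keeps re-requesting a busy k2pow worker with exponential backoff rather than the worker maintaining a queue. `request_k2pow` here is a stand-in closure for the real RPC call; the function name and backoff parameters are assumptions for illustration.

```rust
use std::thread;
use std::time::Duration;

/// Retries a k2pow request up to `max_attempts` times with exponential
/// backoff, treating any error (e.g. "worker busy") as retryable.
pub fn solve_with_retry<F>(mut request_k2pow: F, max_attempts: u32) -> Result<u64, String>
where
    F: FnMut() -> Result<u64, String>,
{
    let mut delay = Duration::from_millis(10);
    let mut last_err = String::from("no attempts made");
    for _ in 0..max_attempts {
        match request_k2pow() {
            Ok(pow) => return Ok(pow),
            Err(e) => {
                last_err = e;
                thread::sleep(delay);
                delay = delay.saturating_mul(2); // back off before retrying
            }
        }
    }
    Err(last_err)
}
```

With this shape, the worker stays stateless ("busy" or a result), and all queueing policy lives with the caller, which also makes request-based autoscaling in front of the workers simpler.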
We allow up to X CPU threads for RandomX computation. These are called workers in this codebase; one worker is one CPU thread. @poszu checked, and on some CPUs it makes no sense to use more than Y cores, as beyond that it is not any faster because of the CPU architecture. Plus, AFAIK, the limit was also per NUMA group.
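The worker-count heuristic described above could look like the sketch below: default to the number of CPU threads, but clamp to a configurable ceiling, since measurements suggested diminishing (or negative) returns beyond some per-CPU limit. `MAX_USEFUL_WORKERS` and `effective_workers` are assumed names, and 16 is a placeholder value to be tuned per CPU/NUMA group.

```rust
use std::thread;

/// Assumed per-machine ceiling beyond which more RandomX workers stop
/// helping (placeholder value; tune per CPU architecture / NUMA group).
const MAX_USEFUL_WORKERS: usize = 16;

/// Resolves the worker count: explicit config wins, otherwise the number
/// of available CPU threads, clamped into [1, MAX_USEFUL_WORKERS].
pub fn effective_workers(configured: Option<usize>) -> usize {
    let available = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    configured.unwrap_or(available).min(MAX_USEFUL_WORKERS).max(1)
}
```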
Right, so my understanding here is that the solution is already opinionated about how it will be used with a load balancer. I'm just trying to make sure, because it sounds like it's not going to be self-contained. I.e., if you want to use k2pow in an external-service configuration with more than one instance/worker, then you'll have to build an external service with a specific load-balancer configuration and maybe other layers of tooling that would get the results and store them in Redis. I guess the expectation is that users would build that tooling? Or are we going to offer a complete solution?
I think for now we can assume that it's good enough to have one instance that knows all. If a queue makes it simpler, we can try (we can always make the queue size 1) and reply "queue full". But in general, running more than one k2pow (RandomX) per CPU (not thread, not core) will not make it faster, likely even slower. Even with the queue we don't need to delete, I guess. The other possibility would be to add some MQ logic, or take something off the shelf like NATS and do proper fan-in/fan-out. But that sounds like overkill for a first iteration :)
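The "one in-flight job, reject the rest" behavior (effectively a queue of size 0/1) can be sketched with a `try_lock` on a shared mutex: a second request fails instantly instead of waiting, which a real service would map to a gRPC `UNAVAILABLE` ("try again later") response. All names here are illustrative, not the actual service API; the field is `pub` only so a caller can simulate a busy worker.

```rust
use std::sync::Mutex;

/// A worker that accepts at most one k2pow job at a time.
pub struct K2powWorker {
    /// Held for the duration of a job; `pub` so examples can simulate load.
    pub busy: Mutex<()>,
}

impl K2powWorker {
    pub fn new() -> Self {
        Self { busy: Mutex::new(()) }
    }

    /// Runs a job if the worker is idle; fails immediately if busy.
    pub fn try_solve(&self, challenge: &[u8]) -> Result<u64, &'static str> {
        // try_lock fails instantly when another job holds the worker,
        // which maps onto replying "queue full" / gRPC UNAVAILABLE.
        let _guard = self.busy.try_lock().map_err(|_| "busy: try again later")?;
        // Placeholder for the actual RandomX PoW computation.
        Ok(challenge.len() as u64)
    }
}
```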
Yes, probably into a separate crate to avoid pulling GRPC- and CLI-related dependencies into the library, similarly to how the certifier and post services are done.
No need for a GUID. The set of input parameters
Yes, a separate crate (look above for why).
I think we should keep the proto files in the api repo, similarly to the
The k2pow prover uses rayon to parallelize computing RandomX hashes for multiple nonces. This is a CPU-heavy task, and there is no point in using more threads than CPU cores (or a configurable value). I think the best approach would be to run one PoW using all cores at a time, and to decide whether to queue the other incoming requests or reject them with "try again later" (UNAVAILABLE status, perhaps? See: https://grpc.io/docs/guides/status-codes).
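The parallel nonce search described above can be sketched with std threads standing in for rayon, and a trivial predicate standing in for "this nonce's RandomX hash meets the difficulty target". Each worker strides through its own slice of the nonce space and all workers stop once any of them finds a valid nonce; `find_nonce` and its shape are assumptions for illustration, not the prover's real code.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Searches the nonce space with `workers` threads; returns a nonce for
/// which `is_valid` holds. `is_valid` stands in for the RandomX check.
pub fn find_nonce(workers: u64, is_valid: fn(u64) -> bool) -> u64 {
    let found = Arc::new(AtomicU64::new(u64::MAX)); // MAX = "nothing yet"
    let mut handles = Vec::new();
    for w in 0..workers {
        let found = Arc::clone(&found);
        handles.push(thread::spawn(move || {
            // Worker w checks nonces w, w + workers, w + 2*workers, ...
            let mut nonce = w;
            while found.load(Ordering::Relaxed) == u64::MAX {
                if is_valid(nonce) {
                    found.fetch_min(nonce, Ordering::Relaxed);
                    break;
                }
                nonce += workers;
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    found.load(Ordering::Relaxed)
}
```

Since the search saturates every thread it is given, running two PoWs at once just makes both slower, which is why one-PoW-at-a-time plus UNAVAILABLE for the rest is attractive.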
Request to allow offloading of k2pow to a separate server.
This problem was partially solved by the 1:N post-service feature; however, it still relies on compute being executed where the storage exists, or on suboptimal network-share performance.
The idea of this feature is to allow separation of concerns between a low power storage server such as an Intel N100 and a high power compute node, such as a gaming computer or dedicated ex-enterprise server.
This would maintain security for the network while lowering electricity costs for smeshers.
As background information, this feature was added to h9-miner earlier in the year.
I would like to see the feature introduced for official software to reduce incentive to move to h9-miner and help support "free range" netspace.