[FR] cluster management for FuseQuery #883

Closed
zhang2014 opened this issue Jun 23, 2021 · 15 comments · Fixed by #1011
Labels: C-cloud-re-architecture (Category: cloud architecture), C-feature (Category: feature), prio: high (High priority)

Comments

@zhang2014
Member

Summary

FuseQuery is a stateless service, so it should be very easy to scale in and out. For example:

# When I want to start FuseQuery in standalone mode:
: ) ./fuse_query --store fuse_store_host_name:9191
# Or (quoted, because an unquoted ';' would end the shell command)
: ) ./fuse_query --store "192.168.0.1:9191;192.168.0.2:9191"


# When I want to start FuseQuery in cluster mode:
: ) ./fuse_query --store fuse_store_host_name:9191 --namespace cluster_1
# Or
: ) ./fuse_query --store "192.168.0.1:9191;192.168.0.2:9191" --namespace cluster_1

# When I want to start FuseQuery and join an existing cluster:
: ) ./fuse_query --store fuse_store_host_name:9191 --namespace exists_cluster_name
# Or
: ) ./fuse_query --store "192.168.0.1:9191;192.168.0.2:9191" --namespace exists_cluster_name
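
The three cases above differ only in whether `--namespace` is given. Below is a minimal Rust sketch of how those two flags might be interpreted, assuming the semicolon-separated `--store` format shown above. `StartMode`, `parse_store_addrs`, and `start_mode` are hypothetical names for illustration, not existing FuseQuery code.

```rust
/// Hypothetical start-up mode derived from the presence of `--namespace`.
#[derive(Debug, PartialEq)]
enum StartMode {
    /// No namespace given: run as a single node.
    Standalone,
    /// Namespace given: create that cluster or join it if it already exists.
    Cluster { namespace: String },
}

/// Split a `--store` value such as "192.168.0.1:9191;192.168.0.2:9191"
/// into individual addresses, ignoring empty segments.
fn parse_store_addrs(raw: &str) -> Vec<String> {
    raw.split(';')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}

/// Decide the mode from the optional `--namespace` value.
fn start_mode(namespace: Option<&str>) -> StartMode {
    match namespace {
        Some(ns) if !ns.is_empty() => StartMode::Cluster {
            namespace: ns.to_string(),
        },
        _ => StartMode::Standalone,
    }
}
```

In this sketch, starting a new cluster and joining an existing one are the same code path; which of the two happens would depend on what the registry already holds for that namespace.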

In this case, we need FuseStore to act as the registry center. We may need the following APIs:

  • Register node(namespace, node_name, ip, flight_port, CPUs, Memory, heartbeat_interval)
  • Heartbeat(namespace, node_name)
  • GetRegisteredNodes
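
To make the three calls concrete, here is a rough in-memory sketch in Rust. It is not FuseStore code: the `Registry` and `NodeInfo` types are hypothetical, and treating a node as dead after two missed heartbeat intervals is an assumption chosen only for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Node description mirroring the proposed Register arguments.
#[derive(Debug, Clone)]
struct NodeInfo {
    ip: String,
    flight_port: u16,
    cpus: u32,
    memory_bytes: u64,
    heartbeat_interval: Duration,
}

/// In-memory stand-in for the registry that FuseStore would provide.
#[derive(Default)]
struct Registry {
    // (namespace, node_name) -> (node info, time of last heartbeat)
    nodes: HashMap<(String, String), (NodeInfo, Instant)>,
}

impl Registry {
    /// Register (or re-register) a node; registration counts as a heartbeat.
    fn register(&mut self, namespace: &str, node_name: &str, info: NodeInfo, now: Instant) {
        self.nodes
            .insert((namespace.to_string(), node_name.to_string()), (info, now));
    }

    /// Refresh the last-seen time. Returns false if the node was never registered.
    fn heartbeat(&mut self, namespace: &str, node_name: &str, now: Instant) -> bool {
        match self
            .nodes
            .get_mut(&(namespace.to_string(), node_name.to_string()))
        {
            Some(entry) => {
                entry.1 = now;
                true
            }
            None => false,
        }
    }

    /// Sorted names of nodes in `namespace` whose last heartbeat is at most
    /// two heartbeat intervals old (the factor 2 is an assumption).
    fn get_registered_nodes(&self, namespace: &str, now: Instant) -> Vec<String> {
        let mut names: Vec<String> = self
            .nodes
            .iter()
            .filter(|((ns, _), (info, seen))| {
                ns == namespace && now.duration_since(*seen) <= info.heartbeat_interval * 2
            })
            .map(|((_, name), _)| name.clone())
            .collect();
        names.sort();
        names
    }
}
```

Passing `now` explicitly keeps the sketch deterministic; a real service would read its own clock.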

CC: @dantengsky @BohuTANG

@zhang2014 zhang2014 added the C-feature Category: feature label Jun 23, 2021
@BohuTANG
Member

Very cool.

Currently the store provides:

  1. storage
  2. metadata
  3. KV API for states

For cluster node registration, two questions:
Q1: Is the KV API alone OK?
Q2: Do we need subscribe and notify?

@zhang2014
Member Author

zhang2014 commented Jun 23, 2021

Q1: Is the KV API alone OK?

It's OK, but we need expire_key and get_prefix APIs (get_prefix is not necessary with a two-level key design).
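
As an illustration of what `expire_key` and `get_prefix` could look like together, here is a toy Rust sketch. It assumes one key per node in the form `namespace/node_name`; that layout, the `ExpiringKv` type, and its method names are assumptions, not the store's actual KV API.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Toy KV store with per-key expiry; a stand-in for the store's KV API.
#[derive(Default)]
struct ExpiringKv {
    // key -> (value, deadline after which the key counts as expired)
    entries: BTreeMap<String, (String, Instant)>,
}

impl ExpiringKv {
    /// Insert or refresh a key that expires `ttl` after `now`.
    fn put_with_ttl(&mut self, key: &str, value: &str, ttl: Duration, now: Instant) {
        self.entries
            .insert(key.to_string(), (value.to_string(), now + ttl));
    }

    /// Return all live (key, value) pairs whose key starts with `prefix`.
    /// The ordered map lets the scan start at the prefix and stop at the
    /// first key that no longer matches.
    fn get_prefix(&self, prefix: &str, now: Instant) -> Vec<(String, String)> {
        self.entries
            .range(prefix.to_string()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .filter(|(_, (_, deadline))| *deadline > now)
            .map(|(k, (v, _))| (k.clone(), v.clone()))
            .collect()
    }
}
```

With this layout, a node that stops refreshing its key simply disappears from the `get_prefix("cluster_1/", now)` result once its TTL passes.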

Q2: Do we need subscribe and notify?

It's not necessary; we can fetch the node list once per query. But it would be better to provide subscribe and notify.

@BohuTANG
Member

A cloud service makes sense.
The cloud service can be built into the store or built on top of it :)

@dantengsky
Member

I have a strong feeling that in the long run, we need another layer like Snowflake's cloud services layer, to separate meta, authentication/authorization, infra management, etc. from the Store layer.

A cloud service makes sense.
The cloud service can be built into the store or built on top of it :)

Oops, I just deleted the comment, so I "recover" it here. :D

@BohuTANG BohuTANG added this to the v0.5 milestone Jun 23, 2021
@drmingdrmer
Member

I have a strong feeling that in the long run, we need another layer like Snowflake's cloud services layer, to separate meta, authentication/authorization, infra management, etc. from the Store layer.

A cloud service makes sense.
The cloud service can be built into the store or built on top of it :)

Oops, I just deleted the comment, so I "recover" it here. :D

Making the meta service a standalone subsystem? That smells good.

@jovany-wang

jovany-wang commented Jun 23, 2021

I have some thoughts which may be naive but I think they're worth sharing for your reference.

My proposal here is to abstract a component, ClusterManagement (CloudService is also fine with me), to manage our cluster, including nodes, plans, and more. It has the following features:

  • (1) It's stateless and supports external meta storage as a plugin.
  • (2) It provides SQL-like API.
  • (3) It supports listen/notify.

Let me elaborate the details one by one.

(1) It's stateless and supports external meta storage as a plugin.

The purpose of being stateless is to be cloud native, and the purpose of being pluggable is to make Datafuse easy to adopt for users from different companies with different infrastructure backgrounds.
The current Raft meta storage in Datafuse can be the default meta storage for ClusterManagement. An external store such as etcd, TiDB, or Redis can also be specified.
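
One way to picture the plugin boundary is a small storage trait that the stateless manager is generic over. The Rust sketch below is only an illustration under that assumption: `MetaStore`, `MemoryStore`, and the methods on `ClusterManagement` are hypothetical, and the in-memory backend merely stands in for Raft, etcd, or any other store.

```rust
use std::collections::HashMap;

/// Hypothetical plugin interface that a meta storage backend would implement.
trait MetaStore {
    fn put(&mut self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &str) -> bool;
}

/// In-memory backend, standing in for the built-in store or an external one.
#[derive(Default)]
struct MemoryStore {
    data: HashMap<String, Vec<u8>>,
}

impl MetaStore for MemoryStore {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.data.insert(key.to_string(), value);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }
    fn delete(&mut self, key: &str) -> bool {
        self.data.remove(key).is_some()
    }
}

/// The manager keeps no state of its own; everything lives in the backend.
struct ClusterManagement<S: MetaStore> {
    store: S,
}

impl<S: MetaStore> ClusterManagement<S> {
    fn new(store: S) -> Self {
        ClusterManagement { store }
    }

    /// Record a node's address under "<namespace>/<node_name>".
    fn add_node(&mut self, namespace: &str, node_name: &str, addr: &str) {
        let key = format!("{}/{}", namespace, node_name);
        self.store.put(&key, addr.as_bytes().to_vec());
    }

    /// Look up a node's address, if it is registered.
    fn node_addr(&self, namespace: &str, node_name: &str) -> Option<String> {
        let key = format!("{}/{}", namespace, node_name);
        self.store
            .get(&key)
            .and_then(|bytes| String::from_utf8(bytes).ok())
    }

    /// Remove a node. Returns false if it was not registered.
    fn remove_node(&mut self, namespace: &str, node_name: &str) -> bool {
        self.store.delete(&format!("{}/{}", namespace, node_name))
    }
}
```

Swapping the backend then means providing another `MetaStore` implementation; the manager code does not change.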

(2) It provides SQL-like API.

From experience, most distributed systems need to manage their nodes, pods, plans, executors, tasks, and more. These are not independent: for example, we may need to select all plans on one node, or select the executors involved in one computing process. That means structured data will be easier to use at runtime.
Additionally, almost all cluster management operations are CRUD operations.
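
To show the kind of relational lookups meant here, the following Rust sketch expresses two such queries over an in-memory table. The `PlanRow` schema and both function names are invented for illustration; the SQL in the comments is only the rough equivalent.

```rust
/// Hypothetical row of a "plans" table: which plan runs on which node
/// on behalf of which query.
#[derive(Debug, Clone, PartialEq)]
struct PlanRow {
    plan_id: u32,
    node: String,
    query_id: u32,
}

/// Roughly: SELECT plan_id FROM plans WHERE node = ?
fn plans_on_node(plans: &[PlanRow], node: &str) -> Vec<u32> {
    plans
        .iter()
        .filter(|p| p.node == node)
        .map(|p| p.plan_id)
        .collect()
}

/// Roughly: SELECT DISTINCT node FROM plans WHERE query_id = ? ORDER BY node
fn nodes_for_query(plans: &[PlanRow], query_id: u32) -> Vec<String> {
    let mut nodes: Vec<String> = plans
        .iter()
        .filter(|p| p.query_id == query_id)
        .map(|p| p.node.clone())
        .collect();
    nodes.sort();
    nodes.dedup();
    nodes
}
```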

(3) It supports listen/notify.

An at-least-once notification mechanism is necessary. It can speed up the failover process, and it also avoids fetching the info every time we need it.
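
A common way to get at-least-once delivery is to keep each event queued until the subscriber acknowledges it, accepting that duplicates may be delivered. The Rust sketch below illustrates that idea with a hypothetical per-subscriber `Outbox`; it is not a proposed API, and a real implementation would also need persistence and timeouts.

```rust
use std::collections::BTreeMap;

/// Membership change that listeners care about.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    NodeJoined(String),
    NodeLeft(String),
}

/// Outbox for one subscriber: events stay queued until acknowledged,
/// so a lost delivery is retried (duplicates are possible).
#[derive(Default)]
struct Outbox {
    next_id: u64,
    pending: BTreeMap<u64, Event>,
}

impl Outbox {
    /// Queue an event and return its delivery id.
    fn publish(&mut self, event: Event) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(id, event);
        id
    }

    /// Everything not yet acknowledged, in publish order.
    fn deliver(&self) -> Vec<(u64, Event)> {
        self.pending
            .iter()
            .map(|(id, ev)| (*id, ev.clone()))
            .collect()
    }

    /// Mark one event as received; it will not be delivered again.
    fn ack(&mut self, id: u64) {
        self.pending.remove(&id);
    }
}
```

Because an unacknowledged event is handed out again on every `deliver` call, subscribers in this scheme have to handle events idempotently.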

In short, I think the first point is the most important part of this architecture, because the other points can be added incrementally.

Please correct me if anything is wrong.

@drmingdrmer
Member

@jovany-wang Thanks for sharing.

If the ClusterService is stateless, it can simply be a lib, right?
Does it have to be some kind of service?

@jovany-wang

@jovany-wang Thanks for sharing.

If the ClusterService is stateless, it can simply be a lib, right?

Yes.

Does it have to be some kind of service?

I'm afraid there's no such service in open source.

Anyway, there are clear benefits to making meta storage pluggable.

@drmingdrmer
Member

@jovany-wang Looking forward to a more detailed design of it. 😃

E.g., the abstraction layer API, what the SQL layer schema would look like, and how to implement listen with SQL.

BTW, an example or illustration of a complete workflow of a FuseQuery node accessing the metadata would make your idea concrete, intuitive, and easy for everybody to understand.

@dantengsky
Member

@jovany-wang

Really appreciate it!!

My proposal here is to abstract a component, ClusterManagement (CloudService is also fine with me), to manage our cluster, including nodes, plans, and more. It has the following features:

  • (1) It's stateless and supports external meta storage as a plugin.
  • (2) It provides SQL-like API.
  • (3) It supports listen/notify.

Let me elaborate the details one by one.

(1) It's stateless and supports external meta storage as a plugin.

The purpose of being stateless is to be cloud native, and the purpose of being pluggable is to make Datafuse easy to adopt for users from different companies with different infrastructure backgrounds.
The current Raft meta storage in Datafuse can be the default meta storage for ClusterManagement. An external store such as etcd, TiDB, or Redis can also be specified.

Does this mean that Query should have some way to access different catalogs, like Spark's CatalogPlugin does?

Metadata of extra catalogs is not stored in Datafuse's Store layer.
Configurations of catalog plugins may be kept in the Store and provided to Query nodes as needed.

That would be nice, though we would need extra abilities to read/write non-built-in data formats.

@dantengsky
Member

(2) It provides SQL-like API.

From experience, most distributed systems need to manage their nodes, pods, plans, executors, tasks, and more. These are not independent: for example, we may need to select all plans on one node, or select the executors involved in one computing process. That means structured data will be easier to use at runtime.
Additionally, almost all cluster management operations are CRUD operations.

(3) It supports listen/notify.

An at-least-once notification mechanism is necessary. It can speed up the failover process, and it also avoids fetching the info every time we need it.

Wow, those are exciting features! Looking forward to your further comments/docs.
Again, your opinions are very helpful. Thank you very much!

@BohuTANG
Member

BohuTANG commented Jun 24, 2021

For Datafuse, I think we need a common/cloud_service mod (mainly for stateful synchronisation and management). It would include many components, and the cluster manager is one of them.

For this part, it should support:

1. get_executors_by_namespace(namespace: &str) -> Result<Vec<ExecutorNode>>
2. upsert_executors(namespace: &str, nodes: Vec<ExecutorNode>) -> Result<()>
3. remove_executors(namespace: &str, nodes: Vec<ExecutorNode>) -> Result<()>

All data is serialized and stored in the Store.

Will this meet all the needs?
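
For discussion, the three functions could behave as in the in-memory Rust sketch below. The `ExecutorNode` fields, the `StoreResult` alias (standing in for the crate's own `Result<T>`), and the choice to return an error for an unknown namespace are all assumptions; a real version would serialize into the Store instead of a map.

```rust
use std::collections::BTreeMap;

/// Stand-in for the crate's `Result<T>`; the error type is an assumption.
type StoreResult<T> = Result<T, String>;

/// Hypothetical node record; the real fields are not specified in this thread.
#[derive(Debug, Clone, PartialEq)]
struct ExecutorNode {
    name: String,
    address: String,
}

#[derive(Default)]
struct ClusterManager {
    // namespace -> (node name -> node)
    namespaces: BTreeMap<String, BTreeMap<String, ExecutorNode>>,
}

impl ClusterManager {
    /// List the nodes of a namespace, ordered by node name.
    fn get_executors_by_namespace(&self, namespace: &str) -> StoreResult<Vec<ExecutorNode>> {
        self.namespaces
            .get(namespace)
            .map(|nodes| nodes.values().cloned().collect::<Vec<_>>())
            .ok_or_else(|| format!("unknown namespace: {}", namespace))
    }

    /// Insert new nodes or overwrite existing ones with the same name.
    fn upsert_executors(&mut self, namespace: &str, nodes: Vec<ExecutorNode>) -> StoreResult<()> {
        let entry = self.namespaces.entry(namespace.to_string()).or_default();
        for node in nodes {
            entry.insert(node.name.clone(), node);
        }
        Ok(())
    }

    /// Remove the given nodes by name; names that are absent are ignored.
    fn remove_executors(&mut self, namespace: &str, nodes: Vec<ExecutorNode>) -> StoreResult<()> {
        let entry = self
            .namespaces
            .get_mut(namespace)
            .ok_or_else(|| format!("unknown namespace: {}", namespace))?;
        for node in &nodes {
            entry.remove(&node.name);
        }
        Ok(())
    }
}
```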

@jovany-wang

Does this mean that Query should have some way to access different catalogs, like Spark's CatalogPlugin does?

Metadata of extra catalogs is not stored in Datafuse's Store layer.
Configurations of catalog plugins may be kept in the Store and provided to Query nodes as needed.

That would be nice, though we would need extra abilities to read/write non-built-in data formats.

My proposal here is for meta storage only.
But what you mentioned sounds like a good idea to me. Spark is able to access data in different formats by using plugins.

@BohuTANG BohuTANG self-assigned this Jun 25, 2021
@BohuTANG
Member

This task will start when #881 is done.

Initial design:

  1. fuse-store supports a common KV API (after Simple user API (based on Store KV service) #881)
  2. cluster information is serialized to KV pairs and stored in fuse-store
  3. fuse-query can add/update/get cluster information by namespace via the API
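
As a rough picture of step 2, the Rust sketch below turns a node record into a KV key and value and back. The `ClusterNode` fields, the `__cluster/` key prefix, and the tab-separated encoding are all assumptions made for illustration; the actual format would be decided alongside the KV API in #881.

```rust
/// Hypothetical node record to be stored as one KV pair.
#[derive(Debug, Clone, PartialEq)]
struct ClusterNode {
    name: String,
    address: String,
    cpus: u32,
}

/// Key layout (an assumption): "__cluster/<namespace>/<node_name>".
fn node_key(namespace: &str, node_name: &str) -> String {
    format!("__cluster/{}/{}", namespace, node_name)
}

/// Encode a node as tab-separated fields (a placeholder format).
fn encode_node(node: &ClusterNode) -> String {
    format!("{}\t{}\t{}", node.name, node.address, node.cpus)
}

/// Decode a value written by `encode_node`; returns None on malformed input.
fn decode_node(value: &str) -> Option<ClusterNode> {
    let mut parts = value.split('\t');
    let name = parts.next()?.to_string();
    let address = parts.next()?.to_string();
    let cpus = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // unexpected trailing field
    }
    Some(ClusterNode { name, address, cpus })
}
```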

@jovany-wang

This task will start when #881 is done.

Initial design:

  1. fuse-store supports a common KV API (after Simple user API (based on Store KV service) #881)
  2. cluster information is serialized to KV pairs and stored in fuse-store
  3. fuse-query can add/update/get cluster information by namespace via the API

Looks good to me for the current stage.

@BohuTANG BohuTANG changed the title from "[FR] Refactor cluster management for FuseQuery" to "[FR] cluster management for FuseQuery" Jul 5, 2021
@BohuTANG BohuTANG added the prio: high High priority label Jul 5, 2021
@BohuTANG BohuTANG added the C-cloud-re-architecture Category: cloud architecture label Aug 12, 2021
5 participants