[FR] cluster management for FuseQuery #883

zhang2014 · 2021-06-23T06:06:01Z

Summary

FuseQuery is a stateless service. It should be very easy to scale in and scale out. for example:

# When I want to start FuseQuery with standalone mode:
: ) ./fuse_query --store fuse_store_host_name:9191
# Or
: ) ./fuse_query --store 192.168.0.1:9191;192.168.0.2:9191


# When I want to start FuseQuery with cluster mode:
: ) ./fuse_query --store fuse_store_host_name:9191 --namespace cluster_1
# Or
: ) ./fuse_query --store 192.168.0.1:9191;192.168.0.2:9191 --namespace cluster_1

# When I want to start FuseQuery and join an existing cluster
: ) ./fuse_query --store fuse_store_host_name:9191 --namespace exists_cluster_name
# Or
: ) ./fuse_query --store 192.168.0.1:9191;192.168.0.2:9191 --namespace exists_cluster_name

In this case, we need FuseStore as registry center. we may need the following APIs:

Register node(namespace, node_name, ip, flight_port, CPUs, Memory, heartbeat_interval)
Heartbeat(namespace, node_name)
GetRegisteredNodes

CC: @dantengsky @BohuTANG

BohuTANG · 2021-06-23T06:24:57Z

Very cool.

Now store:

storage
metadata
KV API for states

For cluster node register, 2 questions:
Q1: Only KV API is ok?
Q2: Need subscriber and notify?

zhang2014 · 2021-06-23T06:37:42Z

Q1: Only KV API is ok?

It's ok, but we need expire_key and get_prefix(not necessary by tow level keys design) api.

Q2: Need subscriber and notify?

It's not necessary. we can fetch it once every query. but it's better to provide subscriber and notify

BohuTANG · 2021-06-23T07:22:13Z

cloud service make sense.
cloudservice is a builtin or builton of the store :)

dantengsky · 2021-06-23T07:33:35Z

I have a strong feeling that in the long run, we need another layer like snowflake‘s cloud service, and separate meta authentication/authorization, infra management etc. from Store layer.

cloud service make sense.
cloudservice is a builtin or builton of the store :)

oops, I just deleted the comment, I "recover" it here. :D

drmingdrmer · 2021-06-23T10:36:02Z

I have a strong feeling that in the long run, we need another layer like snowflake‘s cloud service, and separate meta authentication/authorization, infra management etc. from Store layer.

cloud service make sense.
cloudservice is a builtin or builton of the store :)

oops, I just deleted the comment, I "recover" it here. :D

Making meta service a standalone sub system? That smells good.

jovany-wang · 2021-06-23T11:42:18Z

I have some thoughts which may be naive but I think they're worth sharing for your reference.

My proposal here is to abstract a component ClusterManagement(CloudServiceis also fine to me) to manage our cluster, including nodes, plans and others. It has some features:

(1) It's stateless and it ports external meta storage as plugin.
(2) It provides SQL-like API.
(3) It supports listen/notify.

Let me elaborate the details one by one.

(1) It's stateless and it ports external meta storage as plugin.

The purpose for stateless is cloud native, and the purpose for plugable is to let datafuse be easy to use for users from different companies with different infra backgrounds.
Current RAFT meta storage in datafuse can be used as a default meta storage for ClusterManagement. And it's able to be specified an external storage as well, such as etcd, tidb, Redis and etc.

(2) It provides SQL-like API.

From the experience, most of distributed systems need manage their nodes, pods, plans, executors, tasks and more. And they are not independent, such as we need select all plans from one node or select the executors which are involved in one computing process. That means structured data will be more easy to be use at runtime.
Additional: Almost of cluster management operations are CURD operations.

(3) It supports listen/notify.

It's necessary to have an ATLEAST ONCE notifying mechanism. It can speed the failure-over process up. it also avoid fetching info every time we need.

Briefly, I think the first one is the most important of this arch, because other points can be added as increment.

Please correct me if some thing wrong.

drmingdrmer · 2021-06-23T13:11:58Z

@jovany-wang Thanks for sharing.

If the ClusterService is stateless, it can be simply a lib right?
Does it have to be some kind of service?

jovany-wang · 2021-06-23T14:25:15Z

@jovany-wang Thanks for sharing.

If the ClusterService is stateless, it can be simply a lib right?

Yes.

Does it have to be some kind of service?

I'm afraid there's no such service in open source.

Anyway there are visible benefits to make meta storage plugable.

drmingdrmer · 2021-06-24T02:15:37Z

@jovany-wang Looking forward to a more detailed design of it. 😃

E.g., the abstraction layer API, what the sql layer schema would be like and how to implement listen with sql.

BTW an example/illustration of a complete workflow of fusequery node accessing the meta data would make your idea concrete, intuitive and easy to understand for everybody.

dantengsky · 2021-06-24T07:48:24Z

@jovany-wang

Really appreciate it!!

My proposal here is to abstract a component ClusterManagement(CloudServiceis also fine to me) to manage our cluster, including nodes, plans and others. It has some features:

(1) It's stateless and it ports external meta storage as plugin.

(2) It provides SQL-like API.

(3) It supports listen/notify.

Let me elaborate the details one by one.

(1) It's stateless and it ports external meta storage as plugin.

The purpose for stateless is cloud native, and the purpose for plugable is to let datafuse be easy to use for users from different companies with different infra backgrounds.
Current RAFT meta storage in datafuse can be used as a default meta storage for ClusterManagement. And it's able to be specified an external storage as well, such as etcd, tidb, Redis and etc.

Does this mean that Query should have some approaches to access different catalogs, like Spark's CatalogPlugin does?

metadata of extra catalogs are not stored in datafuse's Store layer.
Configurations of catalog plugins may be kept in the Store, and provides to Query nodes as needed.

That would be nice, we should have extra abilities to read/write non-built-in data format though.

dantengsky · 2021-06-24T08:04:12Z

(2) It provides SQL-like API.

From the experience, most of distributed systems need manage their nodes, pods, plans, executors, tasks and more. And they are not independent, such as we need select all plans from one node or select the executors which are involved in one computing process. That means structured data will be more easy to be use at runtime.
Additional: Almost of cluster management operations are CURD operations.

(3) It supports listen/notify.

It's necessary to have an ATLEAST ONCE notifying mechanism. It can speed the failure-over process up. it also avoid fetching info every time we need.

Wow, those are exciting features! Looking forward to your further comments/docs.
Again, your opinions are very helpful, truly thanks!

BohuTANG · 2021-06-24T08:56:26Z

For datafuse, I think we need a common/cloud_service mod(Mainly for stateful synchronisation and management), it includes many components, cluster manager is one of them.

For this part, it should support:

1. get_executors_by_namespace(namespace:&str) -> Result<Vec<ExecutorNode>>
2. upsert_executors(namespace:&str, ndoes: Vec<ExecutorNode>) -> Result<()>
3. remove_executors(namespace:&str, ndoes: Vec<ExecutorNode>) -> Result<()>

All datas are serialized and stored in the Store.

Will meet all the needs?

jovany-wang · 2021-06-24T09:50:46Z

Does this mean that Query should have some approaches to access different catalogs, like Spark's CatalogPlugin does?

metadata of extra catalogs are not stored in datafuse's Store layer.
Configurations of catalog plugins may be kept in the Store, and provides to Query nodes as needed.

That would be nice, we should have extra abilities to read/write non-built-in data format though.

My proposal here is for meta storage only.
But what you mentioned here is a good idea to me. Spark has the abilities to access data in different formats by using plugin.

BohuTANG · 2021-06-25T03:11:07Z

This task will start when #881 is done.

Initial design:

fuse-store supports common kv api (after Simple user API (based on Store KV service) #881 )
cluster informations serialized to kv and stored in fuse-store
fuse-query can add/update/get cluster informations with namespace via the api

jovany-wang · 2021-06-26T06:09:53Z

This task will start when #881 is done.

Initial design:

fuse-store supports common kv api (after Simple user API (based on Store KV service) #881 )

cluster informations serialized to kv and stored in fuse-store

fuse-query can add/update/get cluster informations with namespace via the api

Looks good to me for current stage.

zhang2014 added the C-feature Category: feature label Jun 23, 2021

BohuTANG added this to the v0.5 milestone Jun 23, 2021

BohuTANG mentioned this issue Jun 24, 2021

[cluster] add executor interface #898

Closed

BohuTANG self-assigned this Jun 25, 2021

This was referenced Jun 25, 2021

[store/meta] feature: add "unclassified" namespace to meta to store everything else a user need. #902

Merged

[store] feature: store-flight provides general purpose kv api: upser_kv() and get_kv() #906

Merged

BohuTANG changed the title ~~[FR] Refactor cluster management for FuseQuery~~ [FR] cluster management for FuseQuery Jul 5, 2021

BohuTANG added the prio: high High priority label Jul 5, 2021

This was referenced Jul 5, 2021

[roadmap-track] Query cluster #747

Closed

ISSUE-883: cluster management #991

Closed

ISSUE-883: cluster management with namespace #1011

Merged

zhang2014 mentioned this issue Jul 26, 2021

ISSUE-883 continue cluster management with namespace #1184

Closed

dantengsky mentioned this issue Aug 3, 2021

Integration test with public datasets #1266

Closed

zhang2014 mentioned this issue Aug 5, 2021

[api] move warp to tokio-rs/axum #1256

Closed

BohuTANG added the C-cloud-re-architecture Category: cloud architecture label Aug 12, 2021

BohuTANG mentioned this issue Aug 23, 2021

Refactoring: Lift LocalDatabase into DataSource APIs #1568

Closed

zhang2014 mentioned this issue Sep 25, 2021

ISSUE-883 continue cluster management with namespace #1731

Merged

databend-bot closed this as completed in #1011 Sep 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FR] cluster management for FuseQuery #883

[FR] cluster management for FuseQuery #883

zhang2014 commented Jun 23, 2021

BohuTANG commented Jun 23, 2021

zhang2014 commented Jun 23, 2021 •

edited

Loading

BohuTANG commented Jun 23, 2021

dantengsky commented Jun 23, 2021

drmingdrmer commented Jun 23, 2021

jovany-wang commented Jun 23, 2021 •

edited

Loading

drmingdrmer commented Jun 23, 2021

jovany-wang commented Jun 23, 2021

drmingdrmer commented Jun 24, 2021

dantengsky commented Jun 24, 2021

(1) It's stateless and it ports external meta storage as plugin.

dantengsky commented Jun 24, 2021

(2) It provides SQL-like API.

(3) It supports listen/notify.

BohuTANG commented Jun 24, 2021 •

edited

Loading

jovany-wang commented Jun 24, 2021

BohuTANG commented Jun 25, 2021

jovany-wang commented Jun 26, 2021

[FR] cluster management for FuseQuery #883

[FR] cluster management for FuseQuery #883

Comments

zhang2014 commented Jun 23, 2021

BohuTANG commented Jun 23, 2021

zhang2014 commented Jun 23, 2021 • edited Loading

BohuTANG commented Jun 23, 2021

dantengsky commented Jun 23, 2021

drmingdrmer commented Jun 23, 2021

jovany-wang commented Jun 23, 2021 • edited Loading

(1) It's stateless and it ports external meta storage as plugin.

(2) It provides SQL-like API.

(3) It supports listen/notify.

drmingdrmer commented Jun 23, 2021

jovany-wang commented Jun 23, 2021

drmingdrmer commented Jun 24, 2021

dantengsky commented Jun 24, 2021

(1) It's stateless and it ports external meta storage as plugin.

dantengsky commented Jun 24, 2021

(2) It provides SQL-like API.

(3) It supports listen/notify.

BohuTANG commented Jun 24, 2021 • edited Loading

jovany-wang commented Jun 24, 2021

BohuTANG commented Jun 25, 2021

jovany-wang commented Jun 26, 2021

zhang2014 commented Jun 23, 2021 •

edited

Loading

jovany-wang commented Jun 23, 2021 •

edited

Loading

BohuTANG commented Jun 24, 2021 •

edited

Loading