
High availability #169

Open
6 tasks
mostafa opened this issue Feb 26, 2023 · 3 comments
Labels
enhancement (New feature or request), epic (To be broken down into multiple tasks), needs investigation (Investigation is needed to flesh out the details and possibly create new tickets)
Milestone

Comments

@mostafa
Member

mostafa commented Feb 26, 2023

This is to ensure HA of GatewayD by running a cluster of machines that can connect to each other and serve clients. So, plan and create tickets for all of the following features, then start implementing them.

  • Distributed state management (using gossip protocols)
  • High-availability
  • Fail-over (fault detection)
  • Clustering
  • Service mesh
  • Control plane?

Resources

@mostafa mostafa converted this from a draft issue Feb 26, 2023
@mostafa mostafa self-assigned this Feb 26, 2023
@mostafa mostafa added the enhancement New feature or request label Feb 26, 2023
@mostafa mostafa changed the title Clustering Clustering and service mesh Feb 26, 2023
@mostafa mostafa changed the title Clustering and service mesh High availability Oct 31, 2023
@mostafa mostafa moved this from ✨ New to 📋 Backlog in GatewayD Core Public Roadmap Oct 31, 2023
@mostafa mostafa added this to the v0.9.x milestone Oct 31, 2023
@mostafa mostafa added the epic To be broken down into multiple tasks label Nov 1, 2023
@mostafa mostafa removed their assignment Dec 9, 2023
@mostafa mostafa added the needs investigation Investigation is needed to flesh out the details and possibly create new tickets label May 1, 2024
@mostafa mostafa modified the milestones: v0.9.x, v0.10.x Oct 15, 2024
@sinadarbouy
Collaborator

For this issue, I think we can solve it by using github.com/hashicorp/raft (as mentioned in the issue description) to handle the state and coordination between nodes.

Here’s how I see it working:

  1. Expose a Raft Port: We’ll need to open up an extra port for Raft. Then, during startup, all the nodes can connect and form a Raft cluster.

  2. Single Raft Cluster for All Config Groups: Instead of having a separate Raft cluster for each configuration group, we can just have one for all of them. It should simplify things and reduce overhead.

  3. Handling Stateful Parameters: We can store stateful parameters as key-value pairs, similar to how we handle them in the Redis plugin (configurationGroup-Configurationblock-Key). Raft will help ensure all nodes stay in sync on these values.

  4. Fetching State Variables from Files: For things like connection counts, we can store them in a file and fetch them when needed. Since this usually happens in the OnOpen phase and during connection setup, performance shouldn’t be an issue.

With this approach, if we have three instances of GatewayD running, they can all receive requests, but they'll rely on Raft's consensus (voting) process to fetch the stateful variables, ensuring everything stays consistent before creating a connection between the client and the DB.

If this approach sounds good, I can start working on it.

@mostafa
Member Author

mostafa commented Oct 17, 2024

After some investigation, and given that the gossip protocol libraries are old and unmaintained, I think the go-to approach is to use Raft, considering that Kafka also used it to move away from ZooKeeper. I think we should stick with simplicity and ease of use, as you also mentioned, rather than creating a Raft cluster per tenant. We can also consider storing the state variables in SQLite or ObjectBox.

Let's create another ticket and link it to this one.

@sinadarbouy
Collaborator

I checked again, and it turns out we don’t need to store our state in a file. HashiCorp Raft already uses BoltDB to handle the Raft logs for persistence and recovery. We can just use sync.Map to keep our state in memory since we’re only working with simple key-value data. Since we don’t have complex data, skipping a database like ObjectBox should be fine, as long as we rely on Raft for consistency and recovery.
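A rough sketch of that in-memory approach, assuming the `inMemoryState` type and its `Replay` hook are hypothetical names (durability would come from the Raft log store, e.g. raft-boltdb, not from this struct):

```go
package main

import (
	"fmt"
	"sync"
)

// inMemoryState keeps the replicated key-value state in a sync.Map,
// relying on Raft's persisted log for durability and recovery.
type inMemoryState struct {
	m sync.Map // composite key -> value
}

func (s *inMemoryState) Set(key, value string) { s.m.Store(key, value) }
func (s *inMemoryState) Delete(key string)     { s.m.Delete(key) }

func (s *inMemoryState) Get(key string) (string, bool) {
	v, ok := s.m.Load(key)
	if !ok {
		return "", false
	}
	return v.(string), true
}

// Replay restores state after a restart by re-applying entries that
// Raft hands back while replaying the committed log during recovery.
func (s *inMemoryState) Replay(entries map[string]string) {
	for k, v := range entries {
		s.m.Store(k, v)
	}
}

func main() {
	st := &inMemoryState{}
	st.Set("default-clients-count", "3")
	if v, ok := st.Get("default-clients-count"); ok {
		fmt.Println(v) // prints 3
	}
}
```

sync.Map fits here because the access pattern is many concurrent reads of disjoint keys with comparatively rare writes, which is exactly the workload it is optimized for.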
