Support external connectivity #61

Closed
EronWright opened this issue Oct 6, 2018 · 0 comments · Fixed by #77
Labels
kind/enhancement Enhancement of an existing feature priority/P0 Data loss/corruption, catastrophic failure, security, functionality lost, permanent damage status/ready The issue is ready to be worked on; or the PR is ready to review


EronWright commented Oct 6, 2018

Overview

The Pravega clusters that are produced by the operator should support external connectivity (i.e. connectivity from outside the Kubernetes cluster). The specific endpoints in question are the controller RPC/REST ports, and the segment store RPC port.

Challenges

Pravega clients write data directly to a dynamic set of segment stores, unlike a conventional service that relies on a stable, load-balanced endpoint. The client discovers the segment stores with the help of the controller, which tracks the active segment stores and their endpoint addresses. Specific challenges include:

  • advertising usable addresses to the client
  • facilitating transport encryption (TLS) to the segment store (e.g. supporting hostname verification)
  • optimizing internal connectivity vs external connectivity (e.g. avoiding an expensive route when possible)

Vendor Specifics

  • PKS: has an option to use NSX-T for Ingress. Istio is apparently on the roadmap.
  • GKE: see references at bottom

Implementation

For conventional services, external connectivity is generally accomplished with an Ingress resource. Ingress primarily supports HTTP(s) and it is unclear whether gRPC (which is HTTP/2-based) is supported (ref).

Ingress is probably not suitable for exposing the segment store. For workloads that are similar to Pravega, e.g. Kafka, the typical solution is to use a NodePort.

Keep in mind that Ingress and services of type LoadBalancer may incur additional costs in cloud environments (GCP pricing).

Multiple Advertised Addresses

Some Pravega clients will be internal to the cluster, others external. Imagine that the segment store advertised only an external address (backed by a NodePort or other type of service); would the performance of internal clients suffer due to a needlessly expensive route? A mitigation would be to introduce support for multiple advertised addresses ("internal"/"external"). Given a prioritized list, the client could try to connect to the cheapest reachable endpoint.
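The prioritized-list idea could be sketched roughly as follows. This is not Pravega client code: the `AdvertisedAddress` type, the `priority` field, and the `pick_endpoint` helper are hypothetical names used only for illustration, and a plain TCP connect is assumed to be an adequate reachability probe.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class AdvertisedAddress:
    """One advertised endpoint of a segment store (hypothetical type)."""
    name: str      # e.g. "internal" or "external"
    host: str
    port: int
    priority: int  # lower value = cheaper route, tried first


def tcp_reachable(addr: AdvertisedAddress, timeout: float = 1.0) -> bool:
    """Assumed probe: the endpoint is usable if a TCP connection succeeds."""
    try:
        with socket.create_connection((addr.host, addr.port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_endpoint(
    addresses: Iterable[AdvertisedAddress],
    is_reachable: Callable[[AdvertisedAddress], bool] = tcp_reachable,
) -> Optional[AdvertisedAddress]:
    """Return the cheapest reachable address, or None if none can be reached."""
    for addr in sorted(addresses, key=lambda a: a.priority):
        if is_reachable(addr):
            return addr
    return None
```

With this shape, an internal client would reach the "internal" address first and never pay for the external route, while an external client would fail the first probe and fall back to the "external" address. The cost of the extra probe on the external path is one design trade-off to weigh.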

This idea could extend to full-fledged multi-homing, where the segment store binds to a separate interface/port per endpoint, possibly with a separate SSL configuration per endpoint.

NodePort Details

Be sure to set the externalTrafficPolicy field of the Service to Local. This ensures that traffic entering a given VM is routed to the segment store on that same VM, rather than being forwarded to a pod on another node.
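A minimal sketch of such a Service, built as a plain Python dictionary and serialized to JSON (which `kubectl apply -f` accepts alongside YAML). The service name, selector labels, and port numbers below are assumptions chosen for illustration, not values taken from the operator.

```python
import json


def segment_store_node_port_service(name, selector, port, node_port):
    """Build a NodePort Service manifest that keeps traffic on the receiving node."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "NodePort",
            # "Local" avoids a second hop to a pod on a different node.
            "externalTrafficPolicy": "Local",
            "selector": dict(selector),
            "ports": [
                {
                    "name": "server",
                    "protocol": "TCP",
                    "port": port,
                    "targetPort": port,
                    # Must fall inside the cluster's NodePort range
                    # (30000-32767 by default).
                    "nodePort": node_port,
                }
            ],
        },
    }


# Hypothetical values: the name, labels, and ports are placeholders.
service = segment_store_node_port_service(
    name="example-segmentstore-external",
    selector={"component": "example-segmentstore"},
    port=12345,
    node_port=30123,
)
manifest_json = json.dumps(service, indent=2)
```

Writing `manifest_json` to a file gives a manifest that could be applied to a test cluster to check the routing behaviour described above.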

One limitation of NodePort is that only a single segment store may be scheduled on a given cluster node. If multiple were to be scheduled, some would fail with a port-unavailable error. One way to avoid this is to use a DaemonSet to manage the segment store pods.
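The DaemonSet approach might look roughly like this, again as a Python dictionary that can be dumped to JSON. A DaemonSet runs at most one copy of its pod per eligible node, which is the property the paragraph above relies on. The resource name, labels, image, and container port are placeholders, not the operator's actual values.

```python
import json


def segment_store_daemonset(name, labels, image, port):
    """Build a DaemonSet manifest so each node runs one segment store pod."""
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": "segmentstore",
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical values, for illustration only.
daemonset = segment_store_daemonset(
    name="example-segmentstore",
    labels={"component": "example-segmentstore"},
    image="example.invalid/segmentstore:tag",
    port=12345,
)
daemonset_json = json.dumps(daemonset, indent=2)
```

One consequence of this choice is that the number of segment stores follows the number of schedulable nodes instead of a replica count, so scaling would be controlled through node selection rather than through the resource itself.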

References

EronWright added the kind/enhancement label on Oct 6, 2018
adrianmo added the priority/P0 and status/ready labels on Nov 21, 2018
fpj closed this as completed in #77 on Nov 28, 2018