
how to configure envoyfilter to support ratelimit in istio 1.5.0? #22068

Closed · sd797994 opened this issue Mar 11, 2020 · 81 comments

@sd797994 commented Mar 11, 2020:

Because the Mixer policy was deprecated in Istio 1.5, the official recommendation is to use Envoy rate limiting instead of Mixer rate limiting. But there is no documentation guiding us on how to configure an EnvoyFilter to support rate limiting. The native Envoy ratelimit configuration looks like this:

[image]

How do we configure an Istio EnvoyFilter to make this work?

@catman002 commented:

@sd797994 If you solved it, please let me know. Thank you!

@sd797994 (Author) commented Mar 13, 2020:

@catman002 I hope this Envoy ratelimit example can help you: https://github.com/jbarratt/envoy_ratelimit_example. Simple strategies only, if your Mixer policies are not too complicated...
I managed to run this example, but it requires injecting configuration into the sidecar (docker image: envoyproxy/envoy-alpine:latest) at startup (copying config.yaml to the right path, e.g. '/data/ratelimit/config/'), which is very different from Istio's Envoy. I compared the Istio Envoy container with this Envoy container, and I couldn't find any way to inject configuration into Istio's Envoy, so...
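For later readers: on Kubernetes, the usual way to get config.yaml into the Lyft ratelimit container is a ConfigMap volume rather than baking it into the image. A minimal sketch, assuming the /data/ratelimit/config path mentioned above and the envoyproxy/ratelimit image's RUNTIME_ROOT/RUNTIME_SUBDIRECTORY environment variables (Redis settings omitted, names are placeholders):

apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit-config
data:
  config.yaml: |
    domain: test
    descriptors:
      - key: remote_address
        rate_limit:
          unit: minute
          requests_per_unit: 60
---
# Deployment pod-spec fragment for the ratelimit container
containers:
  - name: ratelimit
    image: envoyproxy/ratelimit:master   # tag is an assumption
    env:
      # the service reads config from ${RUNTIME_ROOT}/${RUNTIME_SUBDIRECTORY}/config/
      - name: RUNTIME_ROOT
        value: /data
      - name: RUNTIME_SUBDIRECTORY
        value: ratelimit
    volumeMounts:
      - name: ratelimit-config
        mountPath: /data/ratelimit/config
volumes:
  - name: ratelimit-config
    configMap:
      name: ratelimit-config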

@bianpengyuan (Contributor) commented:

@gargnupur Is there work going on to provide an example setup using the Envoy rate limit filter?

@devstein commented:

@bianpengyuan @gargnupur After much trial and error, here is a working template for rate limiting with the default Istio ingress gateway:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  workloadSelector:
    # select by label in the same namespace
    labels:
      istio: ingressgateway
  configPatches:
      # The Envoy config you want to modify
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.http_connection_manager"
              subFilter:
                name: "envoy.router"
      patch:
        operation: INSERT_BEFORE
        value:
         name: envoy.rate_limit
         config:
           # domain can be anything! Match it to the ratelimiter service config
           domain: test
           rate_limit_service:
             grpc_service:
               envoy_grpc:
                 cluster_name: rate_limit_service
               timeout: 0.25s
    - applyTo: CLUSTER
      match:
        cluster:
          service: ratelimit.default.svc.cluster.local
      patch:
        operation: ADD
        value:
          name: rate_limit_service
          type: STRICT_DNS
          connect_timeout: 0.25s
          lb_policy: ROUND_ROBIN
          http2_protocol_options: {}
          hosts:
            - socket_address:
                address: ratelimit.default.svc.cluster.local
                port_value: 8081
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    - applyTo: VIRTUAL_HOST
      match:
        context: GATEWAY
        routeConfiguration:
          vhost:
            name: "*:80"
            route:
              action: ANY
      patch:
        operation: MERGE
        value:
          rate_limits:
            - actions: # any actions in here
                # Multiple actions nest the descriptors
                # https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/rate_limit_filter#config-http-filters-rate-limit-composing-actions
                # - generic_key:
                    # descriptor_value: "test"
                - request_headers:
                    header_name: "Authorization"
                    descriptor_key: "auth"
                # - remote_address: {}
                # - destination_cluster: {}

@jsenon (Member) commented Mar 26, 2020:

Do you have any plans to reduce the complexity of the rate limit configuration? It seems like a core service mesh feature, and it is implemented in other meshes.

@gargnupur (Contributor) commented:

@jsenon: I'd like to know the pain points you are facing, as that would help us understand what we need to improve; we can then take care of it in the next release of Istio.

@gargnupur (Contributor) commented:

@devstein: Great that it worked for you, and thanks for the example! Can you share any problems you faced or improvements you would like to see?
@bianpengyuan, @sd797994, @catman002, @jsenon: I will be working on steps for using Envoy rate limiting with Istio services and will share them here soon.

gargnupur self-assigned this Mar 27, 2020
@jsenon (Member) commented Mar 27, 2020:

Hi @gargnupur, thanks for your reply. Gloo and Ambassador have implemented a simple configuration style. Why not add the rate limiting feature to the VirtualService, or have a single rate-limiter CRD that translates simple user configuration into Envoy proxy config:

  rate-limit:
    ratelimit_server_ref:
      name: #Rate limiter URL      
      namespace: #Rate limiter namespace 
    request_timeout: #Time out of limiter  
    deny_on_fail: #Do we accept if no answer from limiter server     
    rate_limit:
     maxAmount: #Number of request
     ValidDuration: #Bucket duration

@devstein commented:

Hi @gargnupur, thanks for tackling this! The two biggest challenges I faced were:

  1. Understanding Envoy's concept of a cluster and how it relates to the ratelimit service I deployed. I had hoped I could reference the service directly using the usual ratelimit.default.svc.cluster.local K8s syntax. It's still unclear to me whether this is intended behavior or a bug.

  2. Debugging. To properly debug this filter, I had to look at examples online of what the raw Envoy configuration for the rate limiting filter should look like, then use istioctl proxy-config to check whether the EnvoyFilters I applied modified the config accordingly. I also ran into an issue where the rate limit actions I applied didn't have a match in the rate limit service's config, but I couldn't find any logs for this.
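For example, the kind of commands involved (the pod name is a placeholder):

# dump what the gateway actually received, to verify the EnvoyFilter patches landed
istioctl proxy-config listener istio-ingressgateway-xxxx -n istio-system -o json
istioctl proxy-config route istio-ingressgateway-xxxx -n istio-system -o json
istioctl proxy-config cluster istio-ingressgateway-xxxx -n istio-system -o json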

Let me know if I can help in any other way!

@songford commented Apr 2, 2020:

> [quoting @devstein's EnvoyFilter template above]

If you don't mind me asking, how would you feed the Lyft config into these EnvoyFilters? Like

- key: header_match
  value: quote-path-auth
  rate_limit:
    unit: minute
    requests_per_unit: 2

@devstein From the snippet you kindly provided, I can only see the filters matching a certain header. But where did you put the corresponding configuration for how many requests per unit of time are allowed? Thanks!

@devstein commented Apr 2, 2020:

@songford An example rate limit config for the snippet I provided would be:

domain: test
descriptors:
   # match the descriptor_key from the EnvoyFilter
  - key: auth
    # Do not include a value unless you know what auth value you want to rate limit (i.e a specific API_KEY)
    rate_limit: # describe the rate limit 
      unit: minute
      requests_per_unit: 60

This config is loaded by the ratelimit service you referenced in the envoy.rate_limit filter.


If you wanted to filter by remote_address: {} then you could have the following config:

domain: test
descriptors:
  # Naively rate-limit by IP
  - key: remote_address
    rate_limit:
      unit: minute
      requests_per_unit: 60

I hope this helps!

@songford commented Apr 3, 2020:

@devstein Thanks a lot! It really helps!
My configuration, modified from @devstein's config:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  workloadSelector:
    # select by label in the same namespace
    labels:
      istio: ingressgateway
  configPatches:
    # The Envoy config you want to modify
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.http_connection_manager"
              subFilter:
                name: "envoy.router"
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.rate_limit
          config:
            # domain can be anything! Match it to the ratelimiter service config
            domain: {your_domain_name}
            failure_mode_deny: true
            rate_limit_service:
              grpc_service:
                envoy_grpc:
                  cluster_name: rate_limit_cluster
                timeout: 10s
    - applyTo: CLUSTER
      match:
        cluster:
          service: ratelimit.default.svc.cluster.local
      patch:
        operation: ADD
        value:
          name: rate_limit_cluster
          type: STRICT_DNS
          connect_timeout: 10s
          lb_policy: ROUND_ROBIN
          http2_protocol_options: {}
          hosts:
            - socket_address:
                address: ratelimit.default.svc.cluster.local
                port_value: 8081
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    - applyTo: HTTP_ROUTE
      match:
        context: GATEWAY
        routeConfiguration:
          vhost:
            name: {your_domain_name}
            route:
              name: http-echo-service
              action: ANY
      patch:
        operation: MERGE
        value:
          route:
            rate_limits:
              - actions: # any actions in here
                  # Multiple actions nest the descriptors
                  # - generic_key:
                  # descriptor_value: "test"
                  - {request_headers: {header_name: "user-agent", descriptor_key: "auth_key"}}
                  # - remote_address: {}
                  # - destination_cluster: {}

Several points that I wish to bring up in the hope of helping folks who run into this post with similar requirements:

  1. In my case, the domain has to match the domain you would like to enforce the rate limit on. For instance, if a rate limiter needs to apply to https://maps.google.com/v2, the domain configuration has to match this domain name.
  2. For some reason, in the filter-ratelimit-svc configuration, I had to change the applyTo to HTTP_ROUTE instead of VIRTUAL_HOST; otherwise the request_headers section is injected two layers too shallow. Per the official documentation, this section should sit under virtual_hosts.routes.route, which is the case when applying to HTTP_ROUTE. With VIRTUAL_HOST, the section is inserted directly under virtual_hosts. I haven't verified myself whether it makes a difference.
  3. You can let Istio inject a sidecar into the rate limiter service pod (contrary to my initial warning not to), but you must name the port it uses to receive gRPC calls (normally 8081) accordingly in its corresponding Service, e.g. grpc-8081; see the sketch after this list.
  4. Perhaps it's useful to set failure_mode_deny to true if you run into trouble. You'll know when the rate limit service stops cooperating, as all requests you send to Envoy will return a 500 error. If you read the access log you'll see RLSE, which stands for Rate Limit Service Error (I guess?), so you'll know the rate limiter service has been wired into the loop.
  5. Use istioctl dashboard envoy {{the-envoy-pod-your-ratelimiter-applies-on}} to dump the configuration that's actually written into Envoy and review it carefully.
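A minimal Service manifest illustrating the named-port convention from point 3 (names are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: ratelimit
  namespace: default
spec:
  selector:
    app: ratelimit
  ports:
    # the grpc- prefix tells Istio the protocol, so the sidecar treats
    # the traffic as gRPC/HTTP2 instead of opaque TCP
    - name: grpc-8081
      port: 8081
      targetPort: 8081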

@guyromb commented Apr 23, 2020:

Any plans to support this natively in istio?

@VinothChinnadurai commented Apr 26, 2020:

@songford @bianpengyuan @gargnupur @devstein Can somebody look at the configuration below and help us? It does not create a routes entry (RDS) in the Envoy config_dump, but the cluster entry (CDS) is there.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  workloadSelector:
    # select by label in the same namespace
    labels:
      istio: ingressgateway
  configPatches:
      # The Envoy config you want to modify
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.http_connection_manager"
              subFilter:
                name: "envoy.router"
      patch:
        operation: INSERT_BEFORE
        value:
         name: envoy.rate_limit
         config:
           # domain can be anything! Match it to the ratelimiter service config
           domain: abcdefghi.xxx.com
           rate_limit_service:
             grpc_service:
               envoy_grpc:
                 cluster_name: "outbound|81||vpce-xxx-xxx.vpce-svc-xxx.us-east-1.vpce.amazonaws.com"
               timeout: 10s

---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    - applyTo: VIRTUAL_HOST
      match:
        context: GATEWAY
        routeConfiguration:
          vhost:
            name: "*:80"
            route:
              action: ANY
      patch:
        operation: MERGE
        value:
            rate_limits:
              - actions: # any actions in here
                  # Multiple actions nest the descriptors
                  # - generic_key:
                  # descriptor_value: "test"
                  - {request_headers: {header_name: "method", descriptor_key: "GET"}}
                  - {request_headers: {header_name: "path", descriptor_key: "/api/v2/tickets"}}
                  - {request_headers: {header_name: "host", descriptor_key: "abcdefghi.xxx.com"}}
                  - {request_headers: {header_name: "x-request-id", descriptor_key: "ac5b684b-4bc6-4474-a943-0de4f1faf8df"}}
                  - {request_headers: {header_name: "domain", descriptor_key: "xxxxx"}}
                  # - remote_address: {}
                  # - destination_cluster: {}

Our Service Entry:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
 name: endpoint-new
 namespace: default
spec:
 hosts:
 - vpce-xxx-xx.vpce-svc-xx.us-east-1.vpce.amazonaws.com
 location: MESH_EXTERNAL
 ports:
 - name: grpc
   number: 81
   protocol: GRPC
 resolution: DNS
 endpoints:
 - address: vpce-xxx-xxx.vpce-svc-xxx.us-east-1.vpce.amazonaws.com

@gargnupur (Contributor) commented Apr 27, 2020:

@VinothChinnadurai: Can you share your config_dump? The config looks OK.
Is your ratelimit service hosted on vpce-xxx-xxx.vpce-svc-xxx.us-east-1.vpce.amazonaws.com?

For reference: I followed examples above and this has been working for me: https://github.com/istio/istio/compare/master...gargnupur:nup_try_ratelimit_envoy?expand=1#diff-87007efb70dda4500545ba652cb0b30e

@devstein commented:

@VinothChinnadurai

What does your rate limit service config look like? Have you tried simplifying your rate limit actions as a sanity check? (i.e., only using - remote_address: {})

Also, did you try explicitly creating a CLUSTER definition for the service? I found this to be simpler and less error-prone than referencing the default generated cluster name.

If you can, post your istioctl proxy-config route $POD_NAME output here.

  configPatches:
      # The Envoy config you want to modify
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.http_connection_manager"
              subFilter:
                name: "envoy.router"
      patch:
        operation: INSERT_BEFORE
        value:
         name: envoy.rate_limit
         config:
           # domain can be anything! Match it to the ratelimiter service config
           domain: test
           rate_limit_service:
             grpc_service:
               envoy_grpc:
                 cluster_name: rate_limit_service
               timeout: 0.25s
    - applyTo: CLUSTER
      match:
        cluster:
          service: ratelimit.default.svc.cluster.local
      patch:
        operation: ADD
        value:
          name: rate_limit_service
          type: STRICT_DNS
          connect_timeout: 0.25s
          lb_policy: ROUND_ROBIN
          http2_protocol_options: {}
          hosts:
            - socket_address:
                address: endpoint-new.default.svc.cluster.local
                port_value: 81

@VinothChinnadurai commented:

@gargnupur @devstein First of all, thanks a lot for your responses.

@gargnupur
> Is your ratelimit service hosted on vpce-xxx-xxx.vpce-svc-xxx.us-east-1.vpce.amazonaws.com?
Yes, it is our ratelimit service endpoint.
Config_dump:
https://gist.github.com/VinothChinnadurai/66561838310c63b6a7657b0cde6fc194

@devstein
> What does your rate limit service config look like? Have you tried simplifying your rate limit actions as a sanity check? (i.e. only use - remote_address: {})
I am not exactly getting this point; please explain briefly. Our ratelimit is a gRPC service, and we tested reachability from a worker node via the call below (it is reachable):

curl vpce-xxx-xxx.vpce-svc-xxx.us-east-1.vpce.amazonaws.com:81/v1/accounts/sample

This is the host on which we are trying to apply ratelimit: abcdefghi.xxx.com

istioctl proxy-config route istio-ingressgateway-598796f4d9-h4vh6 -n istio-system
NOTE: This output only contains routes loaded via RDS.
NAME        VIRTUAL HOSTS
http.80     1
            1
istioctl proxy-config route istio-ingressgateway-598796f4d9-h4vh6 -n istio-system --name http.80 -o json
[
    {
        "name": "http.80",
        "virtualHosts": [
            {
                "name": "*:80",
                "domains": [
                    "*",
                    "*:80"
                ],
                "routes": [
                    {
                        "match": {
                            "prefix": "/api/v2/activities",
                            "caseSensitive": true
                        },
                        "route": {
                            "cluster": "outbound|80||twilight-service.twilight-istio.svc.cluster.local",
                            "timeout": "0s",
                            "retryPolicy": {
                                "retryOn": "connect-failure,refused-stream,unavailable,cancelled,resource-exhausted,retriable-status-codes",
                                "numRetries": 2,
                                "retryHostPredicate": [
                                    {
                                        "name": "envoy.retry_host_predicates.previous_hosts"
                                    }
                                ],
                                "hostSelectionRetryMaxAttempts": "5",
                                "retriableStatusCodes": [
                                    503
                                ]
                            },
                            "maxGrpcTimeout": "0s"
                        },
                        "metadata": {
                            "filterMetadata": {
                                "istio": {
                                    "config": "/apis/networking.istio.io/v1alpha3/namespaces/twilight-istio/virtual-service/twilight-vs"
                                }
                            }
                        },
                        "decorator": {
                            "operation": "twilight-service.twilight-istio.svc.cluster.local:80/api/v2/activities*"
                        }
                    },
                    {
                        "match": {
                            "prefix": "/api/_/email_bots",
                            "caseSensitive": true
                        },
                        "route": {
                            "cluster": "outbound|80||emailbot-service.emailbot.svc.cluster.local",
                            "timeout": "0s",
                            "retryPolicy": {
                                "retryOn": "connect-failure,refused-stream,unavailable,cancelled,resource-exhausted,retriable-status-codes",
                                "numRetries": 2,
                                "retryHostPredicate": [
                                    {
                                        "name": "envoy.retry_host_predicates.previous_hosts"
                                    }
                                ],
                                "hostSelectionRetryMaxAttempts": "5",
                                "retriableStatusCodes": [
                                    503
                                ]
                            },
                            "maxGrpcTimeout": "0s"
                        },
                        "metadata": {
                            "filterMetadata": {
                                "istio": {
                                    "config": "/apis/networking.istio.io/v1alpha3/namespaces/emailbot/virtual-service/emailbot-vs"
                                }
                            }
                        },
                        "decorator": {
                            "operation": "emailbot-service.emailbot.svc.cluster.local:80/api/_/email_bots*"
                        }
                    }
                ],
                "rateLimits": [
                    {
                        "actions": [
                            {
                                "requestHeaders": {
                                    "headerName": "method",
                                    "descriptorKey": "GET"
                                }
                            },
                            {
                                "requestHeaders": {
                                    "headerName": "path",
                                    "descriptorKey": "/api/v2/tickets"
                                }
                            },
                            {
                                "requestHeaders": {
                                    "headerName": "host",
                                    "descriptorKey": "abcdefghi.xxx.com"
                                }
                            },
                            {
                                "requestHeaders": {
                                    "headerName": "x-request-id",
                                    "descriptorKey": "ac5b684b-4bc6-4474-a943-0de4f1faf8df"
                                }
                            },
                            {
                                "requestHeaders": {
                                    "headerName": "domain",
                                    "descriptorKey": "xxx"
                                }
                            }
                        ]
                    }
                ]
            }
        ],
        "validateClusters": false
    }
]

Also, did you try explicitly to create a CLUSTER definition for the service? I found this to be simpler/less error-prone than referencing the default generated cluster name.
We tried that but found the issue below:

The EnvoyFilter "filter-ratelimit" is invalid: []: Invalid value: map[string]interface {}{"apiVersion":"networking.istio.io/v1alpha3", "kind":"EnvoyFilter", "metadata":map[string]interface {}{"annotations":map[string]interface {}{"kubectl.kubernetes.io/last-applied-configuration":"{"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"name":"filter-ratelimit","namespace":"istio-system"},"spec":{"configPatches":[{"applyTo":"HTTP_FILTER","match":{"context":"GATEWAY","listener":{"filterChain":{"filter":{"name":"envoy.http_connection_manager","subFilter":{"name":"envoy.router"}}}}},"patch":{"operation":"INSERT_BEFORE","value":{"config":{"domain":"abcdefghi.freshpo.com","failure_mode_deny":true,"rate_limit_service":{"grpc_service":{"envoy_grpc":{"cluster_name":"rate_limit_cluster"},"timeout":"10s"}}},"name":"envoy.rate_limit"}}},{"applyTo":"CLUSTER","match":{"context":"GATEWAY"},"patch":{"operation":"ADD","value":{"connect_timeout":"10s","hosts":[{"socket_address":{"address":"vpce-0b247209ae0145d88-4fa54j71.vpce-svc-0874c1e9512bd57dc.us-east-1.vpce.amazonaws.com","port_value":81}}],"http2_protocol_options":{},"lb_policy":"ROUND_ROBIN","name":"rate_limit_cluster","type":"STRICT_DNS"}}}],"workloadSelector":{"labels":{"istio":"ingressgateway"}}}}\n"}, "creationTimestamp":"2020-04-09T07:09:16Z", "generation":1, "name":"filter-ratelimit", "namespace":"istio-system", "uid":"065b9b71-7a31-11ea-bcfc-0e6d31531fe3"}, "spec":map[string]interface {}{"configPatches":[]interface {}{map[string]interface {}{"applyTo":"HTTP_FILTER", "match":map[string]interface {}{"context":"GATEWAY", "listener":map[string]interface {}{"filterChain":map[string]interface {}{"filter":map[string]interface {}{"name":"envoy.http_connection_manager", "subFilter":map[string]interface {}{"name":"envoy.router"}}}}}, "patch":map[string]interface {}{"operation":"INSERT_BEFORE", "value":map[string]interface {}{"config":map[string]interface {}{"domain":"abcdefghi.freshpo.com", "failure_mode_deny":true, "rate_limit_service":map[string]interface {}{"grpc_service":map[string]interface {}{"envoy_grpc":map[string]interface {}{"cluster_name":"rate_limit_cluster"}, "timeout":"10s"}}}, "name":"envoy.rate_limit"}}}, map[string]interface {}{"applyTo":"CLUSTER", "match":map[string]interface {}{"context":"GATEWAY"}, "patch":map[string]interface {}{"operation":"ADD", "value":map[string]interface {}{"connect_timeout":"10s", "hosts":[]interface {}{map[string]interface {}{"socket_address":map[string]interface {}{"address":"vpce-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.amazonaws.com", "port_value":81}}}, "http2_protocol_options":map[string]interface {}{}, "lb_policy":"ROUND_ROBIN", "name":"rate_limit_cluster", "type":"STRICT_DNS"}}}}, "workloadSelector":map[string]interface {}{"labels":map[string]interface {}{"istio":"ingressgateway"}}}}: validation failure list:
"spec.configPatches.match" must validate one and only one schema (oneOf). Found none valid
spec.configPatches.match.listener in body is required

Kindly unblock us by suggesting what the issue is here.

@devstein commented:

> What does your rate limit service config look like?

I was referring to the Envoy proxy ratelimit service, but I see you are using a custom gRPC service.

> Have you tried simplifying your rate limit actions as a sanity check? (i.e. only use - remote_address: {})

I'm referring to simplifying the rate limit actions. See below:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    - applyTo: VIRTUAL_HOST
      match:
        context: GATEWAY
        routeConfiguration:
          vhost:
            name: "*:80"
            route:
              action: ANY
      patch:
        operation: MERGE
        value:
            rate_limits:
              - actions:
                  - remote_address: {}

> We tried that but found the issue below

What version of Istio are you using?

@VinothChinnadurai Your route definition looks correct. Unfortunately, I'm not sure what your issue is. As a next step, I suggest enabling debug-level logging on your ingress gateway pod to see what is going on.

kubectl -n istio-system exec svc/istio-ingressgateway -- curl -X POST "localhost:15000/logging?filter=debug" -s
kubectl -n istio-system logs svc/istio-ingressgateway -f
# make requests via another terminal 

@VinothChinnadurai commented:

@devstein @gargnupur
Sorry, we are also using the envoyproxy ratelimit spec: https://github.com/envoyproxy/envoy/blob/master/api/envoy/service/ratelimit/v2/rls.proto
We are using Istio 1.5.0.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  workloadSelector:
    # select by label in the same namespace
    labels:
      istio: ingressgateway
  configPatches:
    # The Envoy config you want to modify
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.http_connection_manager"
              subFilter:
                name: "envoy.router"
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.rate_limit
          config:
            # domain can be anything! Match it to the ratelimiter service config
            domain: abcdefghi.freshpo.com
            failure_mode_deny: true
            rate_limit_service:
              grpc_service:
                envoy_grpc:
                  cluster_name: rate_limit_cluster
                timeout: 10s
    - applyTo: CLUSTER
      match:
        cluster:
          service: vpce-xxx-xxx.vpce-svc-xxx.us-east-1.vpce.amazonaws.com
      patch:
        operation: ADD
        value:
          name: rate_limit_cluster
          type: STRICT_DNS
          connect_timeout: 10s
          lb_policy: ROUND_ROBIN
          http2_protocol_options: {}
          hosts:
          - socket_address:
              address: vpce-xxx-xxx.vpce-svc-xxx.us-east-1.vpce.amazonaws.com
              port_value: 81
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    - applyTo: VIRTUAL_HOST
      match:
        context: GATEWAY
        routeConfiguration:
          vhost:
            name: "*:80"
            route:
              action: ANY
      patch:
        operation: MERGE
        value:
          rate_limits:
            - actions: # any actions in here
                  # Multiple actions nest the descriptors
                  # - generic_key:
                  # descriptor_value: "test"
              - {request_headers: {header_name: "method", descriptor_key: "GET"}}
              - {request_headers: {header_name: "path", descriptor_key: "/api/v2/tickets"}}
              - {request_headers: {header_name: "host", descriptor_key: "abcdefghi.xxx.com"}}
              - {request_headers: {header_name: "x-request-id", descriptor_key: "ac5b684b-4bc6-4474-a943-0de4f1faf8df"}}
              - {request_headers: {header_name: "domain", descriptor_key: "xxx"}}
                  # - remote_address: {}
                  # - destination_cluster: {} 

The above applied without any issue.

We tried a sanity check using only remote_address: {} as you mentioned, and the call reaches our ratelimit service :)
Please find the debug log for this:
https://gist.github.com/Adharsh-Muraleedharan/6d844d1d1cb8d5db0bd35e80749ba6b9

But if we try with the necessary headers (removing remote_address: {}), as in the manifests above, the call does not reach our ratelimit service, and we can't find the request entry in the debug logs below:
https://gist.github.com/Adharsh-Muraleedharan/55680ab723763f86664aafbf4e0839cc

Does that mean the issue is with the headers?

Kindly suggest what the issue is here.

@ramaraochavali (Contributor) commented:

Do all the headers have values in the actual request? Please note that if the request does not have a value for any of those headers, Envoy skips calling the rate limit service. See envoyproxy/envoy#10124.
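For example, with the actions above, a request must carry every one of the named headers for Envoy to call the rate limit service at all (host and values are placeholders taken from the config above):

curl http://abcdefghi.xxx.com/api/v2/tickets \
  -H "method: GET" \
  -H "path: /api/v2/tickets" \
  -H "host: abcdefghi.xxx.com" \
  -H "x-request-id: ac5b684b-4bc6-4474-a943-0de4f1faf8df" \
  -H "domain: xxx"

If any one of them is missing, the whole descriptor is dropped and no call is made for that rate_limits entry.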

@VinothChinnadurai commented:

Sure, @ramaraochavali. Let me check with my ratelimit service team, try only the supported headers, and come back.

@VinothChinnadurai commented:

@ramaraochavali @devstein @gargnupur Thanks a lot, guys, for all your responses. It is working now, once we pass all the headers matched by request_headers (under rate_limits.actions) in the request.

I have two questions here.

patch:
        operation: MERGE
        value:
          rate_limits:
            - actions:
              - {request_headers: {header_name: ":authority", descriptor_key: "host"}}
              - {request_headers: {header_name: ":path", descriptor_key: "PATH"}}

  1. The header_name should match what we send in the request to the ingress gateway Envoy, and the descriptor_key becomes the key sent to the rate limit service, with the header value as the descriptor value.

     Say, in the above case, if the request to the Istio gateway is sent as

     curl https://ig.com --header "host:abcd.xxx.com" --header "path: /api/v2/tickets"

     it becomes {"host": "abcd.xxx.com", "PATH": "/api/v2/tickets"} in the request sent from the Istio gateway to our ratelimit service?

  2. Can I skip the rate limit calls based on some header value present in my incoming request to the Istio gateway? (We need some mechanism to skip a certain type of request.)

Kindly suggest.

Thanks once again!!!

@gargnupur (Contributor) commented:

@VinothChinnadurai: Yes to the first question.
For the second one, take a look at https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/route/route_components.proto#envoy-api-msg-route-ratelimit-action-headervaluematch; I think you can use header_value_match to rate limit based on the presence of a header value. See the sketches below.
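A couple of sketches to make that concrete (descriptor keys taken from the comment above; the skip-header name is an assumption). For the first question, the matching Lyft-style service config nests the descriptors in action order:

domain: test
descriptors:
  # outer key from the first action (:authority -> "host")
  - key: host
    descriptors:
      # inner key from the second action (:path -> "PATH")
      - key: PATH
        rate_limit:
          unit: minute
          requests_per_unit: 60

For the second question, a header_value_match action with expect_match: false emits its descriptor only when the matcher does not match, so requests carrying the header produce no descriptor and skip this limit:

rate_limits:
  - actions:
      - header_value_match:
          descriptor_value: "not-skipped"
          # false => emit the descriptor only when the headers below do NOT
          # match; requests sending x-skip-ratelimit: "true" are then not
          # rate limited by this entry
          expect_match: false
          headers:
            - name: "x-skip-ratelimit"
              exact_match: "true"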

@VinothChinnadurai commented:

Thanks a lot, @gargnupur. I will try the same and come back.

@i-prudnikov commented Oct 21, 2020:

@JaveriaK Ahhhh, OK.

Yes, if you see RLSE in the Envoy log, that means the ratelimiter config is valid and recognized by Pilot. With failure_mode_deny=false, the request will be accepted even if Envoy fails to reach the rate limit service. If you toggled debug mode in your rate limit service, there will be an entry corresponding to every request you submitted to the workload. So I assume your assumption is correct: there is something wrong with the config.

Perhaps you can check for yourself whether there is another rate_limit_service cluster definition hiding somewhere in your Envoy proxies? Duplicate cluster rate_limit_service found while pushing CDS is very fishy. Or, if you don't mind, attach the config of interest here if it's not too long, so we can jump in and help.

@JaveriaK, @songford
I believe I have narrowed down the cause of the Duplicate cluster rate_limit_service found while pushing CDS message.
The rate limit filter patches the clusters, adding a new cluster rate_limit_service:

  - applyTo: CLUSTER
    match:
      cluster:
        service: ratelimit.rate-limit.svc.cluster.local
    patch:
      operation: ADD
      value:
        name: rate_limit_service
        type: STRICT_DNS
        connect_timeout: 0.25s
        lb_policy: ROUND_ROBIN
        http2_protocol_options: {}
        hosts:
        - socket_address:
            address: ratelimit.rate-limit.svc.cluster.local
            port_value: 8081

If we comment out this portion of the EnvoyFilter, there are no more warnings in the log.

To clarify: when you deploy the rate-limit service in k8s, Istio automatically recognizes it and adds it to Envoy's clusters. The raw config looks like this:

 {
     "version_info": "2020-10-19T11:25:53Z/83",
     "cluster": {
      "@type": "type.googleapis.com/envoy.api.v2.Cluster",
      "name": "outbound|8080||ratelimit.rate-limit.svc.cluster.local",
      "type": "EDS",
      "eds_cluster_config": {
       "eds_config": {
        "ads": {}
       },
       "service_name": "outbound|8080||ratelimit.rate-limit.svc.cluster.local"
      },
      "connect_timeout": "10s",
      "circuit_breakers": {
       "thresholds": [
        {
         "max_connections": 4294967295,
         "max_pending_requests": 4294967295,
         "max_requests": 4294967295,
         "max_retries": 4294967295
        }
       ]
      },
      "http2_protocol_options": {
       "max_concurrent_streams": 1073741824
      },
      "protocol_selection": "USE_DOWNSTREAM_PROTOCOL",
      "filters": [
       {
        "name": "istio.metadata_exchange",
        "typed_config": {
         "@type": "type.googleapis.com/udpa.type.v1.TypedStruct",
         "type_url": "type.googleapis.com/envoy.tcp.metadataexchange.config.MetadataExchange",
         "value": {
          "protocol": "istio-peer-exchange"
         }
        }
       }
      ],
      "transport_socket_matches": [
       {
        "name": "tlsMode-istio",
        "match": {
         "tlsMode": "istio"
        },
        "transport_socket": {
         "name": "envoy.transport_sockets.tls",
         "typed_config": {
          "@type": "type.googleapis.com/envoy.api.v2.auth.UpstreamTlsContext",
          "common_tls_context": {
           "alpn_protocols": [
            "istio-peer-exchange",
            "istio",
            "h2"
           ],
           "tls_certificate_sds_secret_configs": [
            {
             "name": "default",
             "sds_config": {
              "api_config_source": {
               "api_type": "GRPC",
               "grpc_services": [
                {
                 "envoy_grpc": {
                  "cluster_name": "sds-grpc"
                 }
                }
               ]
              }
             }
            }
           ],
           "combined_validation_context": {
            "default_validation_context": {
             "match_subject_alt_names": [
              {
               "exact": "spiffe://cluster.local/ns/rate-limit/sa/default"
              }
             ]
            },
            "validation_context_sds_secret_config": {
             "name": "ROOTCA",
             "sds_config": {
              "api_config_source": {
               "api_type": "GRPC",
               "grpc_services": [
                {
                 "envoy_grpc": {
                  "cluster_name": "sds-grpc"
                 }
                }
               ]
              }
             }
            }
           }
          },
          "sni": "outbound_.8080_._.ratelimit.rate-limit.svc.cluster.local"
         }
        }
       },
       {
        "name": "tlsMode-disabled",
        "match": {},
        "transport_socket": {
         "name": "envoy.transport_sockets.raw_buffer"
        }
       }
      ]
     },
     "last_updated": "2020-10-19T11:26:40.458Z"
    }

The EnvoyFilter, when applied, adds this portion of cluster config:

{
     "version_info": "2020-10-21T09:35:44Z/7",
     "cluster": {
      "@type": "type.googleapis.com/envoy.api.v2.Cluster",
      "name": "rate_limit_service",
      "type": "STRICT_DNS",
      "connect_timeout": "0.250s",
      "hosts": [
       {
        "socket_address": {
         "address": "ratelimit.rate-limit.svc.cluster.local",
         "port_value": 8081
        }
       }
      ],
      "http2_protocol_options": {}
     },
     "last_updated": "2020-10-21T09:35:44.779Z"
    }

So the two clusters point to the same destination, and Pilot somehow treats them as duplicates?
Can we safely ignore this warning?

@jdomag commented Oct 26, 2020:

Hi guys,

Any idea whether there is a way to use rate limiting for outbound HTTPS traffic? E.g., I would like to allow only 5 requests per second from pod XYZ to https://google.com. Is it doable? I assume that since it's HTTPS traffic, all the HTTP headers will be encrypted and I can't use Envoy descriptors to route it to the rate limit service.

@gargnupur (Contributor) commented:

@jdomag: You can use other Envoy descriptors, like remote_address, that don't depend on HTTP headers. See the sketch below.
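For example, something like this on the client sidecar could attach a per-source-IP limit to the outbound route; a sketch only, assuming the route exists as plain HTTP at the proxy (for HTTPS that means combining it with egress TLS origination, as discussed later in the thread), with the vhost name a placeholder:

  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          # assumed vhost name for the external service after TLS origination
          name: "google.com:80"
          route:
            action: ANY
    patch:
      operation: MERGE
      value:
        rate_limits:
          - actions:
              # remote_address uses the connection's source IP; no HTTP
              # headers are needed
              - remote_address: {}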

@Aryan-NBS commented:

> [quoting @sd797994's earlier comment about the envoy_ratelimit_example]

Did you try it in Istio 1.7.*?

@Aryan-NBS commented:

> [quoting @devstein's EnvoyFilter template above]

Did you try it on Istio 1.7.*?

@Aryan-NBS commented:

> [quoting @songford's istioctl dashboard envoy tip above]

@songford Did you try it on Istio 1.7.4?

@hzxuzhonghu (Member) commented:

I have designed an API; I'd appreciate it if anyone can leave comments:
https://docs.google.com/document/d/1628LHcwuCvTRFhk8rsQKQUmgCxnY6loPFSxliADQDIc/edit#

@Omasumasu commented:

Hi,
Does anyone know how to set a rate limit using a cookie value?
I've tried using the header_to_metadata filter to turn the header value into dynamic metadata, but rate limiting on the dynamic metadata was not working properly. The rate limit service works properly with a header matcher.

I need this because I want to use the cookie-to-metadata filter, which is available from Envoy v1.16:
https://www.envoyproxy.io/docs/envoy/v1.16.0/api-v3/extensions/filters/http/header_to_metadata/v3/header_to_metadata.proto.html?highlight=header_to_meta#extensions-filters-http-header-to-metadata-v3-config-rule

This is my sample configuration. Is there anything missing?

---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: GATEWAY
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
            subFilter:
              name: envoy.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.rate_limit
        typed_config:
          '@type': type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
          domain: ratelimit
          failure_mode_deny: false
          rate_limit_service:
            grpc_service:
              envoy_grpc:
                cluster_name: rate_limit_cluster
          timeout: 0.25s
  - applyTo: CLUSTER
    match:
      cluster:
        service: ratelimit.rate-limit.svc.cluster.local
    patch:
      operation: ADD
      value:
        connect_timeout: 0.25s
        http2_protocol_options: {}
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: rate_limit_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: ratelimit.rate-limit.svc.cluster.local
                    port_value: 8081
        name: dev-rate_limit_cluster
        type: STRICT_DNS
  workloadSelector:
    labels:
      istio: ingressgateway
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit-svc
  namespace: istio-system
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: GATEWAY
      routeConfiguration:
        vhost:
          name: "*:80"
          route:
            action: ANY
    patch:
      operation: MERGE
      value:
        rate_limits:
        - actions:
          - dynamic_metadata:
              descriptor_key: user
              metadata_key:
                key: envoy.lb
                path:
                - key: cookie
  workloadSelector:
    labels:
      istio: ingressgateway
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: header-to-meta-filter
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.http_connection_manager"
              subFilter:
                name: "envoy.router"
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.header_metadata
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.header_to_metadata.v3.Config
            request_rules:
              - header: cookie
                on_header_present:
                  metadata_namespace: envoy.lb
                  key: cookie
                  type: STRING
                remove: false

The ConfigMap of the ratelimit service:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit-config
  namespace: rate-limit
data:
  config.yaml: |
    domain: ratelimit
    descriptors:
      - key: user
        rate_limit:
          unit: minute
          requests_per_unit: 5

@jdomag commented Nov 29, 2020:

> You can use other Envoy descriptors, like remote_address, that don't depend on HTTP headers.

I've decided to use egress TLS origination and Envoy rate limiting for the particular service. I've described this in more detail in the article below, if anybody is interested:
https://domagalski-j.medium.com/istio-rate-limits-for-egress-traffic-8697df490f68

@adityashanbhog commented:

> [quoting @devstein's EnvoyFilter template and @songford's follow-up question above]

Can you please share the solution you used? We have a similar situation and cannot figure out how to use unit and requests_per_unit.

@gargnupur (Contributor) commented:

@A-N-S: Please take a look at the working tests in the Istio repo: https://github.com/istio/istio/blob/master/tests/integration/telemetry/policy/envoy_ratelimit_test.go. They set up the rate limit service using Lyft's ratelimit too.

@adityashanbhog commented Dec 2, 2020:

@gargnupur Thanks for the reference. Can you give some details about "{{ .RateLimitNamespace }}" and "{{ .EchoNamespace }}" used in https://github.com/istio/istio/blob/master/tests/integration/telemetry/policy/testdata/enable_envoy_ratelimit.yaml?

@gargnupur (Contributor) commented:

> Can you give some details about "{{ .RateLimitNamespace }}" and "{{ .EchoNamespace }}"?

RateLimitNamespace -> the namespace where Lyft's Redis-backed rate limit service is set up
EchoNamespace -> the namespace where the echo app is set up

@gargnupur (Contributor) commented:

We have tests in istio/istio for this, so closing the bug...

@arjunkrishnasb commented:

Hi, I followed the official documentation for rate limiting and could not get global rate limiting to work at the gateway level. I added all the details in #32381.
I am totally stuck because the logs don't have any useful info. Can someone help me here?

@msonnleitner commented:

As far as I can see, all examples for Envoy rate limiting add a new cluster and use STRICT_DNS. However, this seems to circumvent Istio's gRPC load balancing: when autoscaling the rate limiting service, gRPC requests are distributed quite unevenly between the ratelimit pods, and long-lived gRPC connections stay pinned to certain pods, which then get overloaded.

When using the rate limiting service with Istio, there is already a cluster created by Istio:

istioctl proxy-config all istio-ingressgateway-5f5f67cdd5-46r2v  -o json
    {
     "version_info": "2021-11-16T13:08:53Z/1270",
     "cluster": {
      "@type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
      "name": "outbound|8081||ratelimit.api.svc.cluster.local",
      "type": "EDS",
...

Is it possible to somehow use that cluster with Envoy rate limiting, in the hope that gRPC request balancing works?

@KoJJang commented Mar 16, 2023:

@msonnleitner
Did you find a solution? I have the same problem.
To make traffic distribution even after autoscaling, I've tried to find a way to create the cluster with the EDS type, but I couldn't find a solution.
Even your workaround (using the Istio-created EDS-typed cluster) makes traffic distribute evenly to the ratelimit pods, but then the ratelimit service doesn't work (traffic is not blocked per the ratelimit config).

Is there any update on making traffic distribute evenly?

@SCLogo commented Mar 16, 2023:

@KoJJang Could you please share your config? We have been using ratelimit for two years.

@msonnleitner commented Mar 16, 2023:

@KoJJang Istio's rate limiting documentation was updated some time ago; it now contains a config that should work:

      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: outbound|8081||ratelimit.default.svc.cluster.local
            authority: ratelimit.default.svc.cluster.local
        transport_api_version: V3 

See the change here: https://github.com/istio/istio.io/pull/11654/files#diff-b20e3a9583a775ef679a0bc15a53c23aa9b6240757bd369d2ac81760072cd7d8R118

Since Istio's docs were updated to reference the cluster outbound|8081||ratelimit.default.svc.cluster.local, I guess it is safe to assume that this is supported and not just a "hack".

@KoJJang commented Mar 16, 2023:

@SCLogo
I'm using the EnvoyFilter below, which ends up with the limit service's ClusterIP as the endpoint of the rate_limit_cluster. (I think the ratelimit cluster should have the pods' addresses as endpoints.)

  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: GATEWAY
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.ratelimit
        typed_config:
          '@type': type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
          domain: any_domain
          failure_mode_deny: false
          rate_limit_service:
            grpc_service:
              envoy_grpc:
                cluster_name: rate_limit_cluster
            transport_api_version: V3
          timeout: 5ms
  - applyTo: CLUSTER
    match:
      cluster:
        service: {LIMIT_SERVICE}.{MY_NAMESPACE}.svc.cluster.local
    patch:
      operation: ADD
      value:
        connect_timeout: 10s
        http2_protocol_options: {}
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: rate_limit_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: {LIMIT_SERVICE}.{MY_NAMESPACE}.svc.cluster.local
                    port_value: 8081
        name: rate_limit_cluster
        type: STRICT_DNS

@msonnleitner
I'll try your suggestion. But is the outbound|8081||ratelimit.default.svc.cluster.local cluster created automatically?
I also have a ratelimit k8s Service, but there is only one cluster, of STRICT_DNS type, like below.
So I've added a temporary VirtualService route to create the outbound|8081||{ratelimit_service}.{namespace}.svc.cluster.local cluster.

$ istioctl pc cluster {gateway_pod}.{namespace}
...
rate_limit_cluster                        -        -          -             STRICT_DNS
...

@msonnleitner commented:

As per the updated Istio config, it should not be necessary to add that cluster manually; try just deleting that section. IIRC, if you define a Kubernetes Service for rate limiting, it should be picked up by Istio automatically.

@KoJJang commented Mar 17, 2023:

@msonnleitner
I found the reason there was no cluster for the ratelimit service: I had set PILOT_FILTER_GATEWAY_CLUSTER_CONFIG: true to save memory, and that was the cause.
So I reverted to PILOT_FILTER_GATEWAY_CLUSTER_CONFIG: false and checked the ratelimit cluster/endpoint. Then I modified the EnvoyFilter as you suggested (and as the Istio docs describe), like below.

      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: outbound|8081||{ratelimit_service}.{namespace}.svc.cluster.local
            authority: {ratelimit_service}.{namespace}.svc.cluster.local
        transport_api_version: V3 

After that, I checked that all requests are distributed to the ratelimit pods evenly, but the rate limit doesn't work :-(
The rest of the configuration is the same as when using the STRICT_DNS cluster. However, with the STRICT_DNS cluster, the ratelimit behavior was correct even though the traffic was not evenly distributed.

@KoJJang commented Mar 20, 2023:

I think my setup didn't work correctly because I was using Istio 1.13 (the Istio 1.13 docs still guided you to set up a STRICT_DNS cluster in the EnvoyFilter).

I found a workaround while staying on 1.13: run the ratelimit service as a headless Service (clusterIP: None), as sketched below. This registers the ratelimit pods as endpoints of the STRICT_DNS cluster and load-balances across them well.
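A minimal sketch of that headless Service (names are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: ratelimit
  namespace: rate-limit
spec:
  # headless: DNS resolves directly to the pod IPs, so a STRICT_DNS
  # Envoy cluster sees every pod as an endpoint and can round-robin
  clusterIP: None
  selector:
    app: ratelimit
  ports:
    - name: grpc-8081
      port: 8081
      targetPort: 8081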
