Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request aggregation #1414

Closed
peerpalo opened this issue Nov 29, 2021 · 5 comments
Closed

Request aggregation #1414

peerpalo opened this issue Nov 29, 2021 · 5 comments
Assignees
Labels
Type: Idea This issue is a high-level idea for discussion.

Comments

@peerpalo
Copy link

What should we add or change to make your life better?

Ocelot has a very useful feature called request aggregation, basically allows to send multiple GETs to different routes and aggregate them in a single object. The idea is to create a cluster with multiple destinations to same route and add an "aggregation policy" (load balancing would not be used in that case).

E.g.

{
  "ReverseProxy": {
    "Routes": {
      "route1": {
        "ClusterId": "cluster1",
        "Match": {
          "Path": "/products"
        }
      }
    },
    "Clusters": {
      "cluster1": {
        "Destinations": [
          {
            "cluster1/destination1": {
              "Address": "https://api1.com/products"
            }
          },
          {
            "cluster1/destination2": {
              "Address": "https://api2.com/products"
            }
          }

        ],
        "AggregatePolicy": "Merge"
      }
    }
  }
}

Should produce a result set like this

{
  "cluster1/destination1": [
    {
      "id": 1
    },
    {
      "id": 3
    }
  ],
  "cluster1/destination2": [
    {
      "id": 5
    }
  ]
}

Why is this important to you?

In our microservice structure we have the same service across various server around the world because of our business intelligence aggregator. We would like to have a unique api access point for specific data so we can merge and decorate our objects before give them to client.

Looking inside code, ForwarderMiddleware and HttpForwarder needs to accept multiple destinations, ProxiedDestination inside IReverseProxyFeature needs also to be renamed to ProxiedDestinations with all of the consequences of that.

@peerpalo peerpalo added the Type: Idea This issue is a high-level idea for discussion. label Nov 29, 2021
@samsp-msft
Copy link
Contributor

We are aware of most of the features that Ocelot has in this area - and made a conscious decision to not include direct support for them in YARP. A lot of the reasons is around performance - to enable this kind of functionality YARP would need to be intelligent about the contents of request/response bodies.

To me, what is unclear is how much value being able to do this kind of thing with configuration really is? The code for each of these individual operations isn't substantial, and API endpoints/routes can be mixed with YARP as part of the overall ASP.NET route table. I suspect if we try to solve these with config, that the scope of what people want to do is substantial - so where would we draw the line - the potential for scope creep is large?

What of the other capabilities of YARP would you want? is it the cluster/load balancing/health? If we had an API to select a destination from a cluster meeting the other rules - would that be sufficient?

@peerpalo
Copy link
Author

peerpalo commented Nov 30, 2021

Well, you're right, project name is "Yet Another Reverse Proxy", not "Yet Another API Gateway". To be honest, I totally agree about configuration or code question. In fact I tried to implement a custom middleware to handle aggregation and, well, it works.

https://gist.github.com/peerpalo/b83598e2c98413a6f5af92fe4800985b

But as you can see we have some problems

  1. If we put LoadBalancingMiddleware before (because we need it for normal calls) AvailableDestinations will be always reduced to one destination
  2. AggregationMiddleware is not using HttpForwarder, we got plenty of code duplication
  3. It doesn't really "fit well" in the entire pipeline (can't access to same log formats, eventids, ForwarderErrorFeature, etc etc...)

What can be improved? In my opinion:

  • Refactor HttpForwarder so it can be nicely inherited and only the effective "send" part can be enhanced (no more code duplication)
  • Allow developers to improve (or change) logic behind ForwarderMiddleware

@samsp-msft
Copy link
Contributor

samsp-msft commented Nov 30, 2021

I wasn't expecting the gateway scenarios to be written as a proxy middleware, more as an explicit endpoint using

app.MapGet("/path/to/map", MyHandler);

Task MyHandler(HttpContext context)
{
  // get addresses for outbound calls
  // make calls
  // aggregate results
}

The part from the proxy that would be interesting is how to get the addresses of the destinations for outbound calls. We have talked about having something in HttpClient that does something similar to the proxy in terms of managing a collection of destinations and balancing calls between them, handling heath checks etc. This is needed for services like gRPC where you probably have more than one destination server for a request, and want to be smart about spreading load, and handling downtime.

For your scenario - do you have other routes that the proxy will be handing that don't do the request aggregation and are just passthrough, or is this the main scenario?

I was thinking if we had something like:

async Task MyHandler(HttpContext context)
{
    var pf = context.GetReverseProxyFeature();
    var cluster = pf.Clusters["cluster1"];
    var dest = cluster.NextDestination();
    var client = cluster.GetHttpClient();

    await client.GetAsync($"https://{dest.Model.Address}/Service1");
    ...
}

So this would retrieve the next applicable destination, and an HttpClient (wrapper) based on the YARP config. Would that be the pieces that you need?

@peerpalo
Copy link
Author

peerpalo commented Dec 1, 2021

Yes, we have also other routes, in fact the idea is to have something like

https://gateway/products

that would return

{
  "destination1": [
    {
      "id": 1
    },
    {
      "id": 3
    }
  ],
  "destination2": [
    {
      "id": 5
    }
  ]
}

so client can be do calls like this

https://gateway/products/destination1/3
https://gateway/products/destination2/5

I wasn't expecting the gateway scenarios to be written as a proxy middleware

I was thinking if we had something like:

async Task MyHandler(HttpContext context)
{
    var pf = context.GetReverseProxyFeature();
    var cluster = pf.Clusters["cluster1"];
    var dest = cluster.NextDestination();
    var client = cluster.GetHttpClient();

    await client.GetAsync($"https://{dest.Model.Address}/Service1");
    ...
}

So this would retrieve the next applicable destination, and an HttpClient (wrapper) based on the YARP config. Would that be the pieces that you need?

Currently we wrote that as a middleware because it's the only way to have AvailableDestinations "cleaned" by health monitor (and other things) but yes, if we can implement a direct route handler having already all needed info (AvailableDestinations, HttpClient for current cluster, transforms, cache, etc, etc...) it would be a good solution!

@Kahbazi
Copy link
Collaborator

Kahbazi commented Dec 1, 2021

if we can implement a direct route handler having already all needed info (AvailableDestinations, HttpClient for current cluster, transforms, cache, etc, etc...) it would be a good solution!

Perhaps Similar to #1165?

@microsoft microsoft locked and limited conversation to collaborators Apr 28, 2022
@Tratcher Tratcher converted this issue into discussion #1677 Apr 28, 2022

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
Type: Idea This issue is a high-level idea for discussion.
Projects
None yet
Development

No branches or pull requests

3 participants