[Core feature] Add Round Robin Support for Agent Server Deployments #5719

Open
2 tasks done
Future-Outlier opened this issue Sep 3, 2024 · 3 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@Future-Outlier
Member

Motivation: Why do you think this is important?

It can improve performance when more than one agent deployment supports the same task types.

Goal: What should the final outcome look like, ideally?

For example, with 2 endpoints (localhost:8000 and localhost:8001), task type a sends its first request to localhost:8000 and its second request to localhost:8001.
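
The alternating behavior described here can be sketched in a few lines of Python. This is only an illustration of round-robin selection, not Flyte code: the `RoundRobinEndpoints` class and its `next_endpoint` method are hypothetical names introduced for the example.

```python
import itertools
import threading


class RoundRobinEndpoints:
    """Hands out endpoints in strict rotation (hypothetical helper, not Flyte code)."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        # itertools.cycle repeats the sequence forever, in order.
        self._cycle = itertools.cycle(list(endpoints))
        # The lock keeps the rotation consistent if several threads ask at once.
        self._lock = threading.Lock()

    def next_endpoint(self):
        with self._lock:
            return next(self._cycle)


picker = RoundRobinEndpoints(["localhost:8000", "localhost:8001"])
first_three = [picker.next_endpoint() for _ in range(3)]
print(first_three)
```

The first request goes to localhost:8000, the second to localhost:8001, and the third wraps back around to localhost:8000.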

Describe alternatives you've considered

Keep the current behavior.

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@Future-Outlier Future-Outlier added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Sep 3, 2024

runllm bot commented Sep 3, 2024

Flyte already supports round-robin load balancing for agent server deployments. According to the Agent Metadata Servicer PR, you can configure multiple agents with round-robin load balancing by specifying the defaultServiceConfig in the agent configuration.

Example configuration:

plugins:
  agent-service:
    defaultAgent:
      endpoint: "dns:///localhost:8000"
      insecure: true
      timeouts:
        GetTask: 100s
      defaultTimeout: 100s
    agents:
      custom_agent:
        endpoint: "dns:///localhost:8001"
        insecure: true
        defaultServiceConfig: '{"loadBalancingConfig": [{"round_robin":{}}]}'
        timeouts:
          DoTask: 300s
          GetTask: 100s
        defaultTimeout: 300s
    agentForTaskTypes:
      - custom_task: custom_agent 
      - api_task: custom_agent
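
The defaultServiceConfig value in the configuration above is a JSON string in the gRPC service config format, where loadBalancingConfig is a list of single-key objects naming a policy. A minimal Python sketch for checking which policy such a string names (the variable names are illustrative only, and this does not verify how Flyte itself consumes the value):

```python
import json

# The same string used for defaultServiceConfig in the example configuration.
service_config = '{"loadBalancingConfig": [{"round_robin":{}}]}'

parsed = json.loads(service_config)

# Each list entry is an object whose single key is the policy name.
policies = [name for entry in parsed["loadBalancingConfig"] for name in entry]
print(policies)
```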

Would you like more details on how to set this up?

@RRap0so
Contributor

RRap0so commented Sep 3, 2024

Hey @Future-Outlier. I'm not entirely sure I follow this. If one endpoint points to a DNS name that resolves to multiple IPs (or, with SRV records, even ports and everything), won't requests be load balanced between them?

Ideally, I would like to not even think about the port, or at least not have an LB based on port, since multiple instances of my agent will all use the same port.
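
To illustrate the DNS-based setup described in this comment, here is a minimal Python sketch of one hostname resolving to several backends that share a port. The `resolve_addresses` helper, the `fake_resolver` stand-in, the hostname, and the IP addresses are all hypothetical; a real deployment would use the default `socket.getaddrinfo` resolver against actual DNS records.

```python
import socket


def resolve_addresses(host, port, resolver=socket.getaddrinfo):
    """Return the unique (ip, port) pairs a hostname resolves to, in order."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        pair = (sockaddr[0], sockaddr[1])
        if pair not in seen:
            seen.append(pair)
    return seen


def fake_resolver(host, port, type=0):
    # Stand-in for DNS: one name, two backends on the same port.
    return [
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.1", port)),
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("10.0.0.2", port)),
    ]


backends = resolve_addresses("agents.example.internal", 8000, resolver=fake_resolver)
print(backends)
```

A client-side round-robin policy would then rotate across the resolved addresses, so the caller only ever configures the single hostname and port.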

@eapolinario eapolinario added good first issue Good for newcomers help wanted Extra attention is needed and removed untriaged This issues has not yet been looked at by the Maintainers labels Sep 5, 2024
@kumare3
Contributor

kumare3 commented Sep 11, 2024

I agree with @RRap0so. I do not think we should implement a custom load balancer. This is a load balancer's job.
