feat(backend): Improve pyro reliability by adding connection timeout, retry, cleanup, and dynamic connection thread size #8574

Merged
merged 6 commits into from
Nov 7, 2024

Conversation

majdyz
Contributor

@majdyz majdyz commented Nov 6, 2024

Background

To keep the issue we hit with the database from recurring with Pyro, we need to manage the lifecycle of the Pyro connection more defensively.

Changes 🏗️

This applies the recommendations from https://pyro5.readthedocs.io/en/latest/tipstricks.html#release-proxies-when-no-longer-used-avoids-after-x-simultaneous-proxy-connections-pyro-seems-to-freeze, namely:

  • Add pyro connection release timeout
  • Add pyro connection retry/auto-reconnect
  • Adjust the connection thread pool cap to match the maximum possible number of execution threads.
  • Add manual cleanup of the Pyro connection on thread/process termination in the block executor.
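
As a rough illustration of the retry/auto-reconnect item above, here is a minimal sketch that uses only the standard library. `FlakyConnection`, `call_with_retry`, and their parameters are hypothetical names made up for this example; they are not the code in this PR and not the Pyro5 API. In Pyro5 itself the comparable knobs are, to my knowledge, the `COMMTIMEOUT` and `MAX_RETRIES` config items and the proxy's `_pyroReconnect()` / `_pyroRelease()` methods.

```python
import time


class FlakyConnection:
    """Hypothetical stand-in for a remote proxy (not a Pyro5 class).

    It fails a fixed number of times before succeeding, so the retry
    logic can be exercised without a real server.
    """

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success
        self.reconnects = 0

    def call(self) -> str:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("connection lost")
        return "ok"

    def reconnect(self) -> None:
        # A real client would re-establish the socket here.
        self.reconnects += 1


def call_with_retry(conn: FlakyConnection, max_retries: int = 3, delay: float = 0.0) -> str:
    """Call conn.call(), reconnecting and retrying on ConnectionError.

    At most max_retries retries are made after the first attempt; once
    they are exhausted the last ConnectionError is re-raised.
    """
    attempt = 0
    while True:
        try:
            return conn.call()
        except ConnectionError:
            if attempt >= max_retries:
                raise
            attempt += 1
            time.sleep(delay)  # optional pause before reconnecting
            conn.reconnect()
```

A caller would wrap each remote call this way, e.g. `call_with_retry(FlakyConnection(2))`, so that transient drops are retried while persistent failures still surface as exceptions.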

Testing 🔍

Note

Only for the new autogpt platform, currently in autogpt_platform/

  • Create from scratch and execute an agent with at least 3 blocks
  • Import an agent from file upload, and confirm it executes correctly
  • Upload agent to marketplace
  • Import an agent from marketplace and confirm it executes correctly
  • Edit an agent from monitor, and confirm it executes correctly

Configuration Changes 📝

Note

Only for the new autogpt platform, currently in autogpt_platform/

If you're making configuration or infrastructure changes, please remember to check you've updated the related infrastructure code in the autogpt_platform/infra folder.

Examples of such changes might include:

  • Changing ports
  • Adding new services that need to communicate with each other
  • Secrets or environment variable changes
  • New or changed infrastructure, such as databases

… retry, cleanup, and dynamic connection thread size
@majdyz majdyz requested a review from a team as a code owner November 6, 2024 12:15
@majdyz majdyz requested review from Swiftyos and removed request for a team November 6, 2024 12:15

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Potential Resource Leak
The close_service_client function only handles PyroClient instances. There might be a need to handle cleanup for other types of clients to prevent resource leaks.
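
One possible way to address this, sketched under assumptions: track every client in a registry and close anything that exposes a `close()` method, rather than checking for one concrete class. `ClientRegistry` and its methods are hypothetical and do not reflect how `close_service_client` is actually implemented in this PR.

```python
import logging
import threading

logger = logging.getLogger(__name__)


class ClientRegistry:
    """Hypothetical registry that closes every tracked client on demand."""

    def __init__(self) -> None:
        self._clients: list[object] = []
        self._lock = threading.Lock()

    def register(self, client: object) -> None:
        with self._lock:
            self._clients.append(client)

    def close_all(self) -> int:
        """Close all registered clients and return how many were closed.

        Clients without a callable close() are skipped; errors raised by
        one client are logged and do not block cleanup of the others.
        """
        with self._lock:
            clients, self._clients = self._clients, []
        closed = 0
        for client in clients:
            close = getattr(client, "close", None)
            if not callable(close):
                continue
            try:
                close()
                closed += 1
            except Exception:
                logger.exception("failed to close client %r", client)
        return closed
```

Calling `close_all()` from the thread or process termination hook would then cover any client type that follows the `close()` convention.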

Configuration Validation
The new Pyro configuration settings are directly assigned without validation. Consider adding checks to ensure the values are within acceptable ranges.
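
A minimal sketch of such range checks, using a standard-library dataclass. The field names, defaults, and bounds below are assumptions chosen for illustration; the actual settings added in `settings.py` may use different names and a different settings framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PyroSettings:
    """Hypothetical container for Pyro client settings with range checks."""

    comm_timeout_seconds: float = 15.0  # assumed default, not from the PR
    max_retries: int = 3                # assumed default, not from the PR

    def __post_init__(self) -> None:
        # Reject values that would disable the timeout or wait unreasonably long.
        if not (0 < self.comm_timeout_seconds <= 300):
            raise ValueError(
                f"comm_timeout_seconds must be in (0, 300], got {self.comm_timeout_seconds}"
            )
        # Reject negative or unreasonably large retry counts.
        if not (0 <= self.max_retries <= 10):
            raise ValueError(
                f"max_retries must be in [0, 10], got {self.max_retries}"
            )
```

Validating at construction time means a bad environment value fails at startup instead of surfacing later as a hung or endlessly retrying connection.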

Error Handling
The calculation of maximum_connection_thread_count doesn't include error handling for potential division by zero or negative values.
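
A guarded version of that calculation could look like the sketch below. It assumes the cap is the product of two worker counts, which is a guess at the formula rather than something stated in this thread; the function and parameter names are hypothetical.

```python
def max_connection_threads(num_graph_workers: int, num_node_workers: int) -> int:
    """Return an upper bound on concurrent execution threads.

    Assumes each graph worker can run up to num_node_workers node
    executions at once, so the bound is their product. Both inputs must
    be positive integers; anything else raises ValueError.
    """
    for name, value in (
        ("num_graph_workers", num_graph_workers),
        ("num_node_workers", num_node_workers),
    ):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return num_graph_workers * num_node_workers
```

For example, `max_connection_threads(10, 5)` returns `50`, while a zero or negative worker count is rejected before it can produce a zero-sized or negative pool.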

ntindle
ntindle previously approved these changes Nov 6, 2024
autogpt_platform/backend/backend/util/settings.py (outdated, resolved)
@majdyz majdyz requested a review from a team as a code owner November 7, 2024 04:23
@majdyz majdyz merged commit 91edf08 into dev Nov 7, 2024
12 checks passed
@majdyz majdyz deleted the zamilmajdy/improve-pyro-reliability branch November 7, 2024 04:34
3 participants