perf: multiple optimizations for gunicorn and expose pg's CONN_MAX_AGE #1424
Conversation
Walkthrough

The pull request introduces configuration changes for database connection management in a Django application. Specifically, it updates the PostgreSQL database settings to include a configurable connection maximum age and modifies the backend startup script to adjust database readiness check timing and Gunicorn server parameters. These changes aim to optimize database connection handling and server performance.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Env as Environment
    participant Settings as Django Settings
    participant DB as Database
    participant Gunicorn as Gunicorn Server
    Env->>Settings: Set CONN_MAX_AGE
    Settings->>DB: Configure connection pool
    Env->>Gunicorn: Configure server parameters
    Gunicorn->>DB: Establish connections
```
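For context, a minimal sketch of how such a setting is typically exposed in Django's DATABASES configuration (illustrative only; the key names and defaults other than CONN_MAX_AGE are assumptions, not the project's exact code):

```python
import os

# Sketch of the exposed setting (illustrative; surrounding keys and defaults
# are assumptions rather than the project's exact code).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("POSTGRES_NAME", "ciso_assistant"),
        "HOST": os.environ.get("DB_HOST", "localhost"),
        # 0 closes the connection after every request; a value such as the
        # PR's 300s default lets workers reuse connections between requests.
        "CONN_MAX_AGE": int(os.environ.get("CONN_MAX_AGE", 300)),
    }
}
```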
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/ciso_assistant/settings.py (1)
349-349: Consider adding upper bound validation for CONN_MAX_AGE.

While the default of 300 seconds is reasonable, consider adding validation to prevent extremely high values that could lead to connection pool exhaustion.

```diff
- "CONN_MAX_AGE": os.environ.get("CONN_MAX_AGE", 300),
+ "CONN_MAX_AGE": min(int(os.environ.get("CONN_MAX_AGE", 300)), 3600),
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- backend/ciso_assistant/settings.py (1 hunks)
- backend/startup.sh (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: startup-functional-test (3.12)
- GitHub Check: startup-docker-compose-test
- GitHub Check: enterprise-startup-functional-test (3.12)
- GitHub Check: functional-tests (3.12, chromium)
- GitHub Check: enterprise-startup-docker-compose-test
- GitHub Check: test (3.12)
- GitHub Check: build (3.12)
- GitHub Check: ruff (3.12)
- GitHub Check: enterprise-functional-tests (3.12, chromium)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (3)
backend/ciso_assistant/settings.py (1)
349-349: Verify connection pool settings with the database configuration.

The persistent connections (CONN_MAX_AGE=300) combined with 3 Gunicorn workers could lead to connection pool exhaustion. Please ensure your PostgreSQL max_connections setting is properly configured.
✅ Verification successful
Database connection pool configuration appears reasonable
The current setup with 3 Gunicorn workers and 5-minute connection persistence is a standard configuration that shouldn't cause connection pool exhaustion. The expected connection count (5-9 connections) is well within typical PostgreSQL defaults. However, monitor connection usage in production to ensure optimal performance.
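As a lightweight way to follow that monitoring advice, a sketch along these lines (assuming psycopg2 is installed and the same DB_HOST/POSTGRES_* environment variables used by the verification scripts below are set) compares active connections against max_connections:

```python
#!/usr/bin/env python3
# Sketch only: compare active PostgreSQL connections against max_connections.
# Assumes psycopg2 is available and the DB_HOST/POSTGRES_* env vars are set.
import os
import psycopg2

conn = psycopg2.connect(
    host=os.environ.get("DB_HOST", "localhost"),
    user=os.environ.get("POSTGRES_USER", "postgres"),
    password=os.environ.get("POSTGRES_PASSWORD", ""),
    dbname=os.environ.get("POSTGRES_NAME", "postgres"),
)
with conn.cursor() as cur:
    cur.execute("SHOW max_connections;")
    max_conn = int(cur.fetchone()[0])
    cur.execute("SELECT count(*) FROM pg_stat_activity;")
    used = cur.fetchone()[0]
print(f"{used}/{max_conn} connections in use")
conn.close()
```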
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Check PostgreSQL max_connections setting if accessible
if [ -n "$DB_HOST" ]; then
  PGPASSWORD=$POSTGRES_PASSWORD psql -h $DB_HOST -U $POSTGRES_USER -d $POSTGRES_NAME -c "SHOW max_connections;"
fi
```

Length of output: 15
Script:
```bash
#!/bin/bash
# Search for Gunicorn configuration and worker settings
echo "=== Searching for Gunicorn configuration ==="
rg -l "gunicorn" --type yaml --type conf
rg -l "workers" --type yaml --type conf

echo -e "\n=== Checking for connection pooling settings ==="
rg -l "connection.*pool" --type python
rg -B2 -A2 "CONN_MAX_AGE" --type python

echo -e "\n=== Looking for database middleware ==="
rg -l "DATABASE.*MIDDLEWARE" --type python
rg -l "db.*middleware" --type python
```

Length of output: 734
Script:
```bash
#!/bin/bash
# Search for configuration files and settings
echo "=== Searching for Gunicorn configuration files ==="
fd -e yml -e yaml -e ini -e conf -e config

echo -e "\n=== Checking for worker and connection settings ==="
rg "workers\s*="
rg "bind\s*="
rg "CONN_MAX_AGE"

echo -e "\n=== Looking for database configuration ==="
rg -A3 "DATABASES\s*="
```

Length of output: 10501
backend/startup.sh (2)
28-28: Verify impact of reduced timeout on existing long-running requests.

The timeout reduction from 300s to 100s might affect existing long-running operations. Please ensure all endpoints complete within this new limit.
✅ Verification successful
The timeout reduction to 100s appears safe
The codebase analysis shows no evidence of operations that would require more than 100 seconds:
- All database operations are standard CRUD and reporting queries
- No bulk operations or heavy data processing
- No long-running external API integrations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Search for potential long-running operations in views and tasks
echo "Searching for potential long-running operations..."
rg -n "sleep|time.sleep|asyncio.sleep" --type py

# Look for existing timeout configurations in views
echo "Checking for existing timeout configurations..."
rg -n "timeout|TIMEOUT" --type py
```

Length of output: 1271
Script:
```bash
#!/bin/bash
# Search for potential database bulk operations
echo "Checking for bulk database operations..."
rg -n "bulk_create|bulk_update|bulk_delete" --type py

# Search for potentially complex queries
echo "Checking for complex queries..."
rg -n "annotate|aggregate|raw|extra|values|select_related|prefetch_related" --type py

# Look for external API calls
echo "Checking for external API calls..."
rg -n "requests\.|urllib|http|client\.post|client\.get" --type py
```

Length of output: 30219
28-28: Review Gunicorn worker configuration for production readiness.

The current configuration might need adjustment:
- Worker count (3) might be too low for production loads
- Timeout (100s) might be insufficient for long-running requests
- Keep-alive (30s) aligns well with CONN_MAX_AGE but might need tuning based on load
Consider using the following formula for workers:
(2 * CPU_CORES) + 1
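A minimal sketch of that formula as a Gunicorn configuration file (illustrative only; this project starts Gunicorn from startup.sh, so the gunicorn.conf.py file name and the exact values here are assumptions mirroring the numbers discussed in this review):

```python
# gunicorn.conf.py -- sketch only; values mirror the ones discussed above.
import multiprocessing

workers = (2 * multiprocessing.cpu_count()) + 1  # (2 * CPU_CORES) + 1
timeout = 100   # request timeout, matching the reduced value under review
keepalive = 30  # kept close to CONN_MAX_AGE and the proxy/LB idle timeout
```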
```diff
@@ -16,7 +16,7 @@ fi

 while ! python manage.py showmigrations iam >/dev/null; do
 echo "database not ready; waiting"
-sleep 20
+sleep 15
```
🛠️ Refactor suggestion
Add maximum retry limit to database readiness check.
The current implementation might lead to infinite retries. Consider adding a maximum retry count with proper error handling.
```diff
+ MAX_RETRIES=20
+ RETRY_COUNT=0
 while ! python manage.py showmigrations iam >/dev/null; do
+ if [ $RETRY_COUNT -ge $MAX_RETRIES ]; then
+ echo "Failed to connect to database after $MAX_RETRIES attempts"
+ exit 1
+ fi
 echo "database not ready; waiting"
 sleep 15
+ RETRY_COUNT=$((RETRY_COUNT + 1))
 done
```
Committable suggestion skipped: line range outside the PR's diff.
LGTM
Note to self: keep-alive needs to be aligned with caddy and/or the extra proxy/LB