Superset scheduling issues #9906

Closed
thammaneni opened this issue May 26, 2020 · 2 comments
Labels
!deprecated-label:bug Deprecated label - Use #bug instead inactive Inactive for >= 30 days

Comments

thammaneni commented May 26, 2020

Hi Team,

This is about how Apache Superset's scheduling functionality works. I would like to better understand the Celery and Redis configuration for the chart email schedules module. I have upgraded to the latest version, 0.36.0, and scheduling performance is clearly better than in the earlier version.

However, I have been getting the exceptions listed below for a long time. I have also raised tickets on the Superset GitHub repository. I want to take the conversation one level further through this email, and I would appreciate your thoughts and suggestions.

Test case:
I created 25+ scheduled jobs with a 30-minute interval (*/30 * * * *). The reports have a maximum size of 5k rows and 25 columns. I monitored the scheduled jobs continuously for 10 to 24 hours. The screenshot below shows the delivery status at each interval.
Note: Red entries mark emails that were not delivered for that slot.
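
To give a sense of the load in this test, the arithmetic can be sketched as follows. This is only an illustration: `minutes_for_step` is a hypothetical helper that handles just the `*/N` minute form used above, and the job count of 25 is the lower bound of "25+".

```python
from typing import List


def minutes_for_step(field: str) -> List[int]:
    """Expand a cron minute field of the form '*/N' into the minutes it fires on."""
    if not field.startswith("*/"):
        raise ValueError("only the '*/N' form is handled in this sketch")
    step = int(field[2:])
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(0, 60, step))


cron = "*/30 * * * *"
fire_minutes = minutes_for_step(cron.split()[0])  # [0, 30]

jobs = 25                                  # assumed: lower bound of "25+"
slots_per_day = len(fire_minutes) * 24     # 48 delivery slots per day
emails_per_day = jobs * slots_per_day      # 1200 emails expected per day
```

Each red entry in the screenshot is therefore one miss out of roughly 1200 expected deliveries per day under these assumptions.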

Screenshot:
image

The following are the major exceptions that we encounter continuously.

| S.No | Exception Type |
| --- | --- |
| 1 | NoSuchColumnError("Could not locate column in row for column 'slice_email_schedules.id'") |
| 2 | ResourceClosedError('This result object does not return rows. It has been closed automatically.') |
| 3 | DatabaseError('(psycopg2.DatabaseError) error with status PGRES_TUPLES_OK and no message from the libpq') |

Celery configuration: we use Redis for both the broker URL (“redis://localhost:6379/0”) and the Celery result backend (“redis://localhost:6379/1”).
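
Since the configuration reads both URLs from the environment, it can help to confirm that the variables resolve to the intended Redis databases. A minimal sketch, assuming the two URLs quoted above; `redis_db_index` is a hypothetical helper, not part of Superset or Celery.

```python
import os
from urllib.parse import urlparse


def redis_db_index(url: str) -> int:
    """Return the database number encoded in a redis:// URL path (0 if absent)."""
    parsed = urlparse(url)
    if parsed.scheme != "redis":
        raise ValueError(f"not a redis URL: {url!r}")
    path = parsed.path.lstrip("/")
    return int(path) if path else 0


# Values from this report; setdefault keeps anything already exported.
os.environ.setdefault("BROKER_URL", "redis://localhost:6379/0")
os.environ.setdefault("CELERY_RESULT_BACKEND", "redis://localhost:6379/1")

broker_db = redis_db_index(os.environ["BROKER_URL"])
backend_db = redis_db_index(os.environ["CELERY_RESULT_BACKEND"])

if broker_db == backend_db:
    print("broker and result backend share Redis database", broker_db)
else:
    print("broker uses db", broker_db, "and result backend uses db", backend_db)
```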

import os

from celery.schedules import crontab


class CeleryConfig:  # pylint: disable=too-few-public-methods
    # Broker URL is taken from the environment when it is set.
    # BROKER_URL = "sqla+sqlite:///celerydb.sqlite"
    if "BROKER_URL" in os.environ:
        BROKER_URL = os.environ["BROKER_URL"]

    CELERY_IMPORTS = ("superset.sql_lab", "superset.tasks")

    # Result backend is taken from the environment when it is set.
    # CELERY_RESULT_BACKEND = "db+sqlite:///celery_results.sqlite"
    if "CELERY_RESULT_BACKEND" in os.environ:
        CELERY_RESULT_BACKEND = os.environ["CELERY_RESULT_BACKEND"]

    # Log level defaults to INFO unless LOG_LEVEL is set.
    # CELERYD_LOG_LEVEL = "DEBUG"
    CELERYD_LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")

    # Each worker process reserves one task at a time.
    CELERYD_PREFETCH_MULTIPLIER = 1

    # Acknowledge a task only after it has finished executing.
    CELERY_ACKS_LATE = True

    # Per-task overrides for rate and time limits.
    CELERY_ANNOTATIONS = {
        "sql_lab.get_sql_results": {"rate_limit": "100/s"},
        "email_reports.send": {
            "rate_limit": "1/s",
            "time_limit": 300,
            "soft_time_limit": 350,
            "ignore_result": True,
        },
    }

    # Beat triggers the hourly scheduler at minute 1 of every hour.
    CELERYBEAT_SCHEDULE = {
        "email_reports.schedule_hourly": {
            "task": "email_reports.schedule_hourly",
            "schedule": crontab(minute=1, hour="*"),
        },
    }
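
One detail worth checking in the annotations above is the ordering of the two time limits. In Celery the soft limit raises `SoftTimeLimitExceeded` inside the task, while the hard limit terminates the worker process, so the soft limit normally needs to be the lower of the two. A small sketch of such a check, run against a copy of the annotation values from this report; `find_inverted_limits` is a hypothetical helper, not a Celery or Superset function.

```python
from typing import Dict, List

# Copy of the annotation values from the configuration in this report.
ANNOTATIONS: Dict[str, dict] = {
    "sql_lab.get_sql_results": {"rate_limit": "100/s"},
    "email_reports.send": {
        "rate_limit": "1/s",
        "time_limit": 300,
        "soft_time_limit": 350,
        "ignore_result": True,
    },
}


def find_inverted_limits(annotations: Dict[str, dict]) -> List[str]:
    """Return task names whose soft time limit is not below the hard limit."""
    return [
        name
        for name, opts in annotations.items()
        if "time_limit" in opts
        and "soft_time_limit" in opts
        and opts["soft_time_limit"] >= opts["time_limit"]
    ]


inverted = find_inverted_limits(ANNOTATIONS)
for task_name in inverted:
    print(f"{task_name}: soft_time_limit is not below time_limit")
```

With the values shown, `email_reports.send` is flagged: the hard limit of 300 seconds would be reached before the soft limit of 350 seconds. Whether this plays any part in the exceptions listed earlier is not established here.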

We use the following commands to start the Celery worker and Celery beat:

celery worker --app=superset.tasks.celery_app:app --loglevel=${LOG_LEVEL:-error} --soft-time-limit 400 --time-limit 500 --autoscale=20,6 --pool=prefork -Ofair -c 6

celery beat --app=superset.tasks.celery_app:app

PIP versions:
Python3.7
celery==4.4.2
kombu==4.6.8
psycopg2==2.8.5
redis==3.5.0
postgresql server version 9.2

I hope the above information helps you understand my schedule configuration. If any further details are required, please reply to this email. Please help me correct anything here, including any version upgrades that may be needed.

Thank you for your valuable time.

Best Regards,
Srini T.

Environment

PIP versions:
Python3.7
celery==4.4.2
kombu==4.6.8
psycopg2==2.8.5
redis==3.5.0
postgresql server version 9.2

Checklist

Make sure these boxes are checked before submitting your issue - thank you!

  • [Yes] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [Yes] I have reproduced the issue with at least the latest released version of superset.
  • [Yes] I have checked the issue tracker for the same issue and I haven't found one similar.

@thammaneni thammaneni added the !deprecated-label:bug Deprecated label - Use #bug instead label May 26, 2020
stale bot commented Jul 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Jul 25, 2020
@stale stale bot closed this as completed Aug 1, 2020
@robdiciuccio
Member

Likely related: #13350
