Support running superset via pex #1713
@@ -4,7 +4,82 @@ from __future__ import division
 from __future__ import print_function
 from __future__ import unicode_literals
 
-from superset.cli import manager
+import argparse
+import sys
 
+import gunicorn.app.base
+
+
+class GunicornSupersetApplication(gunicorn.app.base.BaseApplication):
+    def __init__(self, address, port, workers, timeout):
+        self.options = {
+            'bind': '%s:%s' % (address, port),
+            'workers': workers,
+            'timeout': timeout,
+            'limit-request-line': 0,
+            'limit-request-field_size': 0
+        }
+
+        super(GunicornSupersetApplication, self).__init__()
+
+    def load_config(self):
+        config = dict([(key, value) for key, value in self.options.iteritems()
+                       if key in self.cfg.settings and value is not None])
+        for key, value in config.iteritems():
+            self.cfg.set(key.lower(), value)
+
+    def load(self):
+        from superset import app
+
+        return app
+
+
+def run_server():
+    parser = argparse.ArgumentParser(description='Run gunicorn for superset')
+    subparsers = parser.add_subparsers()
+    gunicorn_parser = subparsers.add_parser('runserver')
+
+    gunicorn_parser.add_argument(
+        '-d', '--debug', action='store_true',
+        help='Start the web server in debug mode')
+    gunicorn_parser.add_argument(
+        '-a', '--address', type=str, default='127.0.0.1',
+        help='Specify the address to which to bind the web server')
+    gunicorn_parser.add_argument(
+        '-p', '--port', type=int, default=8088,
This pull request has changed the default port number from […] Is this intentional? These settings are still mentioned in the docs. +cc @szmate1618

I hope it's not :) Care to open a PR?

The solution is not easy, it seems. The idea here appears to be to avoid importing […] I think it may be fine to retire these settings. Gunicorn can be configured via a config file. Superset could instead have a separate Gunicorn config file that would allow setting these web-server-specific parameters in one logical place. Plus, it would give the user a chance to set many other Gunicorn settings that cannot be controlled now, such as the SSL certificate. What do you think?

I don't know, I kinda liked the one-config-file-to-rule-them-all approach. Let's ping the Airbnb guys @yolken @mistercrunch. BTW, could we please move the discussion to #1939?
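As a rough illustration of the config-file idea floated in the comment above: a Gunicorn config file is an ordinary Python module whose top-level names are setting names. The sketch below mirrors the defaults used in this diff. The file name `gunicorn_config.py` and the certificate paths are hypothetical placeholders, not something this PR defines.

```python
# gunicorn_config.py (hypothetical file name, not part of this PR)
# Gunicorn reads its settings from module-level variables in this file.

bind = '127.0.0.1:8088'   # address:port, same defaults as run_server()
workers = 4               # number of worker processes
timeout = 30              # worker timeout in seconds

# For both of these limits, 0 means "unlimited".
limit_request_line = 0
limit_request_field_size = 0

# TLS settings, which the CLI flags in this diff cannot express.
# The paths are placeholders; uncomment and adjust to enable.
# certfile = '/path/to/server.crt'
# keyfile = '/path/to/server.key'
```

Assuming the app object is importable as `superset:app`, such a file would be passed with something like `gunicorn -c gunicorn_config.py superset:app`.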
+        help='Specify the port on which to run the web server')
+    gunicorn_parser.add_argument(
+        '-w', '--workers', type=int, default=4,
+        help='Number of gunicorn web server workers to fire up')
+    gunicorn_parser.add_argument(
+        '-t', '--timeout', type=int, default=30,
+        help='Specify the timeout (seconds) for the gunicorn web server')
+
+    args = parser.parse_args()
+
+    if args.debug:
+        from superset import app
+
+        app.run(
+            host='0.0.0.0',
+            port=int(args.port),
+            threaded=True,
+            debug=True)
+    else:
+        gunicorn_app_obj = GunicornSupersetApplication(
+            args.address, args.port, args.workers, args.timeout)
+        gunicorn_app_obj.run()
+
+
 if __name__ == "__main__":
-    manager.run()
+    if len(sys.argv) > 1 and sys.argv[1] == 'runserver':
+        # In the runserver case, don't go through the manager so that superset
+        # import is deferred until the app is loaded; this allows for the app to be run via pex
+        # and cleanly forked in the gunicorn case.
+        #
+        # TODO: Refactor cli so that gunicorn can be started without first importing superset;
+        # this will allow us to move the runserver logic back into cli module.
+        run_server()
+    else:
+        from superset.cli import manager
+        manager.run()
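One detail worth checking in `load_config` above: it silently drops any option whose key is not a known Gunicorn setting name. Gunicorn's setting names use underscores (`limit_request_line`, `limit_request_field_size`), so the hyphenated keys in `self.options` would likely be filtered out rather than applied. The sketch below reproduces the filter with a stand-in set of names to show the effect. `KNOWN_SETTINGS` and `filter_options` are hypothetical, not Gunicorn's API, and `.items()` is used in place of the Python 2 `iteritems()`.

```python
# Stand-in for the keys of gunicorn's cfg.settings; a small subset, for illustration.
KNOWN_SETTINGS = {
    'bind', 'workers', 'timeout',
    'limit_request_line', 'limit_request_field_size',
}


def filter_options(options, known=KNOWN_SETTINGS):
    """Keep only options that name a known setting and have a value."""
    return {
        key: value for key, value in options.items()
        if key in known and value is not None
    }


hyphenated = {
    'bind': '127.0.0.1:8088',
    'limit-request-line': 0,        # key as written in the diff
    'limit-request-field_size': 0,  # key as written in the diff
}
underscored = {
    'bind': '127.0.0.1:8088',
    'limit_request_line': 0,
    'limit_request_field_size': 0,
}

print(sorted(filter_options(hyphenated)))   # only 'bind' survives the filter
print(sorted(filter_options(underscored)))  # all three keys survive
```

If the real settings behave like this stand-in, the two request limits in the diff would keep Gunicorn's defaults without any error being raised.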
The CLI logic should live in `cli.py`, as we're trying to keep this file to a bare minimum since it can't be imported easily (no `.py` extension). You should also use `flask_script`, the same way the rest of the CLI is written, as opposed to good old argparse.

I tried that but got lots of DB errors due to forking problems. :-( The issue is that, with the way things are set up now, you can't import `superset` or any of its subpackages before gunicorn forks (i.e., the import has to happen in the `load` function). The underlying cause is that sqlalchemy (initialized in `superset/__init__.py`) is not fork-safe.

We ran into similar issues when pex-ifying knowledge repo. The way around it is to refactor the code so that `cli.py` can be loaded without initializing the flask app and its associated DB connections in the `gunicorn` case. Looking at the code, this is doable, but would require a significant amount of refactoring.

Given these limitations, can we keep the gunicorn wrapper in the `superset` script for now and leave the refactoring as a future TODO? Apologies that it isn't as clean as it would be ideally.

Oh sorry, I now realize I skipped reading the PR's message, which explained all this. We'll need to update our chef recipe when merging this into production.

Sure! The old version of `runserver` will still work until we deprecate it, so nothing should break immediately.
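
The fork-safety constraint discussed in this thread can be sketched without Gunicorn. The idea is that the parent process only holds a recipe for building the app, and the expensive, non-fork-safe initialization runs when `load()` is called, which in Gunicorn happens in each worker after the fork unless `preload_app` is enabled. `DeferredApp` and `make_app` are hypothetical names used for illustration; `make_app` stands in for `from superset import app`.

```python
import os


class DeferredApp(object):
    """Holds a factory and builds the app only when load() is called."""

    def __init__(self, factory):
        self._factory = factory
        self._app = None

    def load(self):
        # Build lazily so that connections are created in the process
        # that calls load(), not in the process that built this object.
        if self._app is None:
            self._app = self._factory()
        return self._app


def make_app():
    # Stand-in for importing superset, which opens DB connections.
    return {'created_in_pid': os.getpid()}


wrapper = DeferredApp(make_app)   # nothing is initialized yet
app = wrapper.load()              # initialization happens here
print(app['created_in_pid'] == os.getpid())
```

This is the same shape as the `load` method in the diff: the import sits inside the method body so that nothing touches the database until a worker asks for the app.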