Add Tracing for SQLAlchemy and Flask-SQLAlchemy #14

therealryanbonham · 2017-12-10T04:36:28Z

Traces calls from SQLAlchemy ORM query and session classes, and calls from Flask-SQLAlchemy.
Also patches tox.ini so that py36 is used for the coverage run, because it errors if your default Python is 2.7.

Added a couple of basic unit tests for both SQLAlchemy and Flask-SQLAlchemy tracing. Updated the README as well.

Feedback is welcome; this is my first attempt to extend X-Ray tracing.


haotianw465 · 2017-12-11T20:03:06Z

README.md

@@ -175,7 +175,61 @@ app.router.add_get("/", handler)

 web.run_app(app)
 ```
+**Add Flask middleware**


Is this a duplicate?

Yes. Removing.

haotianw465 · 2017-12-11T20:05:53Z

README.md


+xray_recorder.end_segment()
+app = Flask(__name__)


I think for a general SQLAlchemy guide there is no need to involve the web framework.

You have to initialize the Flask app in order to initialize Flask-SQLAlchemy:

db = XRayFlaskSqlAlchemy(app)

I was just trying to show a complete example.

Sounds reasonable. I'm fine with this.

haotianw465 · 2017-12-11T20:07:46Z

README.md

+
+```python
+from sqlalchemy.ext.declarative import declarative_base
+from sqlalchemy import *


Any reason you use a wildcard import?

No, in fact that entire import can go. I refactored the way I was doing imports and it is no longer needed.

haotianw465 · 2017-12-11T20:08:42Z

README.md

+from aws_xray_sdk.core.context import Context
+from aws_xray_sdk.ext.sqlalchemy.query import XRaySessionMaker
+
+xray_recorder.configure(service='test', sampling=False, context=Context())


For a quick-start guide for a SQL library you don't need to configure anything on the recorder. You could also remove the Context import.

haotianw465 · 2017-12-11T20:11:49Z

aws_xray_sdk/ext/flask_sqlalchemy/query.py

+from flask_sqlalchemy import SQLAlchemy, BaseQuery, _SessionSignalEvents, get_state
+from aws_xray_sdk.ext.sqlalchemy.query import XRaySession, XRayQuery
+
+def decorate_all_functions(function_decorator):


Shouldn't this file be in the ext/flask folder? aws_xray_sdk/ext/flask/sqlalchemy.py for example

Flask-SQLAlchemy is a separate Python package from Flask, which is why I kept them separated. Willing to change if needed.

I think we can leave it for now.

haotianw465 · 2017-12-11T20:21:31Z

aws_xray_sdk/ext/flask_sqlalchemy/query.py

@@ -0,0 +1,95 @@
+from __future__ import print_function


Unused import? Print statements should not be used.

Done. Was used for debugging.

haotianw465 · 2017-12-11T20:54:05Z

aws_xray_sdk/ext/sqlalchemy/query.py

+from functools import wraps
+
+
+def decorate_all_functions(function_decorator):


Maybe you can create a sqlalchemy util module that holds these shared functions so that you don't duplicate code.
decorate_all_functions and xray_on_call can be shared with flask_sqlalchemy.

Yes, good point. I meant to go back and do this, and completely forgot.
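For readers following along, here is a minimal sketch of what such a shared utility could look like. It is not the code from this PR: the `(cls, func)` decorator signature, the `record_call` stand-in for `xray_on_call`, and the `Query`/`TracedQuery` classes are assumptions made for illustration only.

```python
import inspect
from functools import wraps

calls = []  # stand-in for real tracing: collects (class name, function name)


def record_call(cls, func):
    """Hypothetical stand-in for xray_on_call: log the call, then delegate."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        calls.append((cls.__name__, func.__name__))
        return func(*args, **kwargs)
    return wrapper


def decorate_all_functions(function_decorator):
    """Build a class decorator that wraps every public plain function."""
    def decorator(cls):
        seen = set()
        for base in cls.__mro__[:-1]:  # walk the MRO, skipping `object`
            for name, attr in list(vars(base).items()):
                if name in seen:
                    continue  # a more-derived class already defined this name
                seen.add(name)
                if name.startswith("_") or not inspect.isfunction(attr):
                    continue
                # Wrapped copies are set on the decorated class only,
                # so the base classes themselves are left untouched.
                setattr(cls, name, function_decorator(cls, attr))
        return cls
    return decorator


class Query:  # illustrative base class, not sqlalchemy.orm.Query
    def all(self):
        return [1, 2]

    def _private(self):
        return "hidden"


@decorate_all_functions(record_call)
class TracedQuery(Query):
    def first(self):
        return 1
```

Keeping both helpers in one module means the SQLAlchemy and Flask-SQLAlchemy extensions would import the same wrapper instead of carrying two copies.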

haotianw465 · 2017-12-11T21:03:14Z

aws_xray_sdk/ext/sqlalchemy/query.py

+    pass
+
+@decorate_all_functions(xray_on_call)   
+class XRayQuery(Query):


Any reason you decorate functions from Query class?

The Query class is called by Flask-SQLAlchemy, so this gets us complete traces.

haotianw465 · 2017-12-11T21:03:35Z

aws_xray_sdk/ext/sqlalchemy/query.py

@@ -0,0 +1,61 @@
+from __future__ import print_function


haotianw465 · 2017-12-11T21:03:58Z

setup.py

@@ -35,7 +35,7 @@
        'Programming Language :: Python :: 3.6',
    ],

-    install_requires=['jsonpickle', 'wrapt', 'requests'],
+    install_requires=['jsonpickle', 'wrapt', 'requests', 'future'],


Do you need future if you remove all print statements?

No, future will still be needed for the super() builtin:
from builtins import super

haotianw465 · 2017-12-12T21:35:09Z

Hi, could you take a look at http://docs.aws.amazon.com/xray/latest/devguide/xray-api-segmentdocuments.html#api-segmentdocuments-sql
When you send data to the X-Ray back-end service with an invalid key under subsegment.sql, the data will not be rejected, but it will not propagate through our system. Some of the keys listed in the linked doc might be unavailable for specific SQL libraries, which is OK.

Also, the actual query recording has to be submitted as a separate CR. We have stricter policies for reviewing code related to query recording, and the query is required to be sanitized.
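To illustrate the point about invalid keys under subsegment.sql, here is a small sketch of filtering metadata before it is attached. The key set below is taken from the sample in the linked doc and should be treated as an assumption about the full list; `filter_sql_metadata` is a hypothetical helper, not part of the SDK.

```python
# Keys that appear in the linked doc's sample "sql" section (assumed set).
ALLOWED_SQL_KEYS = frozenset({
    "url",
    "preparation",
    "database_type",
    "database_version",
    "driver_version",
    "user",
    "sanitized_query",
})


def filter_sql_metadata(metadata):
    """Keep only recognised keys that actually have a value."""
    return {
        key: value
        for key, value in metadata.items()
        if key in ALLOWED_SQL_KEYS and value is not None
    }
```

Dropping unknown keys up front avoids the silent failure described above, where the document is accepted but the data never shows up.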

haotianw465 · 2017-12-12T21:38:01Z

tests/ext/flask_sqlalchemy/test_query.py

+    password = db.Column(db.String(255), nullable=False)
+
+
+def _search_entity(entity, name):


This, along with find_sub, could also go in a helper module shared between your two test_query.py files. Since these two functions are very general, they could live under the tests/ root so any other unit test can use your contribution.

haotianw465 · 2017-12-12T21:39:01Z

tests/ext/flask_sqlalchemy/test_query.py

+    return None
+
+
+def find_sub(segment, name):


I would recommend renaming it to find_subsegment. Abbreviating function names is not good practice in general unless it is absolutely needed.
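A rough sketch of how the two helpers discussed here could fit together in a shared test utility. The `Entity` class is a stand-in that only models a `name` and a `subsegments` list; the function bodies are an illustration, not the code under review.

```python
class Entity:
    """Stand-in for a segment or subsegment: a name plus child subsegments."""

    def __init__(self, name, subsegments=()):
        self.name = name
        self.subsegments = list(subsegments)


def _search_entity(entity, name):
    """Depth-first search that also considers the entity itself."""
    if entity.name == name:
        return entity
    for child in entity.subsegments:
        found = _search_entity(child, name)
        if found is not None:
            return found
    return None


def find_subsegment(segment, name):
    """Return the first subsegment named `name` below `segment`, else None."""
    for child in segment.subsegments:
        found = _search_entity(child, name)
        if found is not None:
            return found
    return None
```

A test can then call find_subsegment(segment, 'Query.all') and assert on the returned entity without walking the tree by hand.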


therealryanbonham · 2017-12-14T03:18:16Z

@haotianw465 I made the change and used set_sql to set the sanitized SQL, and added one unit test to test sanitization. That should not be an issue with the SQLAlchemy ORM, as it passes everything via params and those are not included in the SQL output I exposed. I will go ahead and comment out the set_sql call and its test, and then make a second PR for that work once this first batch is accepted.

haotianw465 · 2017-12-14T19:44:18Z

Hi, per doc link http://docs.aws.amazon.com/xray/latest/devguide/xray-api-segmentdocuments.html#api-segmentdocuments-sql
This is the sample sql section

  "sql" : {
    "url": "jdbc:postgresql://aawijb5u25wdoy.cpamxznpdoq8.us-west-2.rds.amazonaws.com:5432/ebdb",
    "preparation": "statement",
    "database_type": "PostgreSQL",
    "database_version": "9.5.4",
    "driver_version": "PostgreSQL 9.4.1211.jre7",
    "user" : "dbuser",
    "sanitized_query" : "SELECT  *  FROM  customers  WHERE  customer_id=?;"
  }

Could you try to get some other info and add it to the sql section as well? For an architecture that has multiple databases, info like database_type, user, and url could be very helpful for customers.

For the SQL subsegment name we have a convention of using database@host, just like the one the sample shows: "name": "ebdb@aawijb5u25wdoy.cpamxznpdoq8.us-west-2.rds.amazonaws.com". This makes sure that when you have multiple databases the back end won't aggregate their latency together.
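As a sketch of the database@host convention, the name can be derived from a SQLAlchemy-style connection URL with the standard library alone. `sql_subsegment_name` and the example URLs are hypothetical, and the fallback for URLs without a host (such as SQLite) is an assumption rather than an SDK rule.

```python
from urllib.parse import urlsplit


def sql_subsegment_name(connection_url):
    """Build a 'database@host' style name from a connection URL."""
    parts = urlsplit(connection_url)
    database = parts.path.lstrip("/")
    host = parts.hostname  # None when the URL has no network location
    if database and host:
        return f"{database}@{host}"
    # Assumed fallback: use whichever piece is available.
    return host or database or "unknown"
```

Because the credentials and port are left out of the name, two databases on the same instance get separate nodes while repeated calls to the same database aggregate together.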

therealryanbonham · 2017-12-16T21:10:31Z

I have a separate branch started to get the set_sql working. I have it adding: url, user, database_type and sanitized_query.

https://github.com/therealryanbonham/aws-xray-sdk-python/tree/set_sql

One issue I see with the tracing I implemented is that it traces each function in SQLAlchemy and sets a local namespace on the subsegments, so it is not creating a remote segment.

Thoughts? Want that as a separate PR?

haotianw465 · 2017-12-18T20:27:37Z

In trace = xray_recorder.begin_subsegment(class_name+'.'+func.__name__), could you change the variable name from trace to subsegment? A trace is a collection of segments and subsegments. Then you can do subsegment.namespace = 'remote'. A database subsegment should always be remote.

For setting SQL metadata, could you include url, user, and database_type in this PR and open a separate PR only for sanitized_query?

therealryanbonham · 2017-12-22T20:26:30Z

The combination of using the class and function name as the subsegment name with a namespace of remote makes a very noisy service map, as every remote call shows up as its own object. It seems the remote namespace wants the name of the subsegment to be something like the hostname you are calling.

Something like
xray_recorder.begin_subsegment(sql['url'])
instead of
xray_recorder.begin_subsegment(class_name+'.'+func.__name__)

The issue with that, though, is that the trace becomes just a bunch of identically named subsegment calls. It seems like remote namespaces need a way to be named separately from the subsegment name, so that multiple SQL functions can be traced as subsegments of a single remote namespace.

Something like

xray_recorder.begin_subsegment(class_name+'.'+func.__name__, namespace=remote, namespace_name=sql['url'])

Is that supported in any way? I have tried to figure out some solution to this but am coming up blank.

haotianw465 · 2017-12-26T20:13:22Z

Hi, as I pointed out in the previous thread:
For the SQL subsegment name we have a convention of using database@host, just like the one the sample shows: "name": "ebdb@aawijb5u25wdoy.cpamxznpdoq8.us-west-2.rds.amazonaws.com". This provides the proper level of aggregation on the service graph.

Using the url only will result in all databases on that instance being aggregated into one node, and it looks like the naming in your current PR results in too many nodes on the service graph.

The naming convention mentioned above works well in all of our SDKs so far and we haven't heard any complaints. But it really depends on whether the patching code can retrieve the database name and endpoint. In this case it looks like you have all the info you need to construct database@endpoint.

haotianw465 · 2018-02-16T00:37:11Z

Hi, the PR is very close to a ready state. Do you have any update on SQL subsegment naming? Or do you prefer us to take whatever you have right now and finish the work by ourselves?


therealryanbonham · 2018-02-16T22:54:25Z

@haotianw465 I think this is ready for review again. It has the SQL URL, user, and database type added to the sql metadata. The sanitized_query is commented out for this PR. Subsegments are named by URL and have the function name as an annotation with the key "sqlalchemy".

I added another set of helper functions to util.py to search for subsegments based on annotation key/value and updated the unit tests to use them as well.
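A minimal sketch of an annotation-based lookup along these lines. The `Entity` stand-in and the name `find_subsegment_by_annotation` are assumptions for illustration; the real helper in util.py may differ.

```python
class Entity:
    """Stand-in for a (sub)segment with annotations and child subsegments."""

    def __init__(self, name, annotations=None, subsegments=()):
        self.name = name
        self.annotations = dict(annotations or {})
        self.subsegments = list(subsegments)


def find_subsegment_by_annotation(segment, key, value):
    """Depth-first search for a subsegment whose annotation `key` equals `value`."""
    for child in segment.subsegments:
        if key in child.annotations and child.annotations[key] == value:
            return child
        found = find_subsegment_by_annotation(child, key, value)
        if found is not None:
            return found
    return None
```

Searching by annotation matters once every SQL subsegment shares the same URL-based name, because the name alone can no longer tell Query.all apart from Query.filter.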

haotianw465 · 2018-02-19T20:11:40Z

aws_xray_sdk/ext/sqlalchemy/util/decerators.py

+        if u.password is None:
+            safe_url = u.geturl()
+        else:
+            # String password from URL


Do you mean strip?
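For context, stripping the password from a connection URL can be sketched with the standard library as follows. This is an illustration, not the code in the diff; `strip_password` and the example URL are hypothetical, and note that `urlsplit` lowercases the hostname.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_password(url):
    """Return `url` with any password removed from the user-info part."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to strip
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around an IPv6 literal
    netloc = host
    if parts.username:
        netloc = f"{parts.username}@{netloc}"
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

Rebuilding the network location from its parts avoids string replacement on the raw URL, which could also hit a password that happens to appear in the path or query.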

haotianw465 · 2018-02-19T20:16:52Z

aws_xray_sdk/ext/sqlalchemy/util/decerators.py

+        if sql is not None:
+            if getattr(c._local, 'entities', None) is not None:
+                subsegment = xray_recorder.begin_subsegment(sql['url'], namespace='remote')
+                # subsegment = xray_recorder.begin_subsegment(class_name+'.'+func.__name__, )


Nitpick. Probably should be removed?

haotianw465 · 2018-02-19T20:21:53Z

tests/ext/sqlalchemy/test_query.py

+    """ Test calling all() on get all records.
+    Verify we run the query and return the SQL as metdata"""
+    # with capsys.disabled():
+    with capsys.disabled():


Duplicate code on lines 63 and 64?

greengiant added 9 commits December 6, 2017 23:43

Initial checkin of Query and BaseQuery overrides

6d6c99d

Fix ext name

8c613d0

Fix import

1c75aa6

Add support for SQLAlchemy.orm and Flask-SQLAlchemy

0729f16

Remove print() statement

31070bd

Attempt to fix handling of Flask not having a request with a xray seg…

8f8468f

…ment

Fix handling of missing segment

b9a00b8

Fix test and add docstrings

d576942

Fix bug with End segment

ea672bb

haotianw465 suggested changes Dec 11, 2017


Code Review Cleanup. Files now all pass flake8 tests

c332d3a

haotianw465 suggested changes Dec 12, 2017


greengiant added 3 commits December 13, 2017 03:35

Move find_subsegment and _search_entity functions to tests/util.py

941097a

Uset set_sql to corectly test the sanitized_query value. Add test to …

730bef0

…sqlalcemy to test filter() and verify params not present in sanitized_query

Merge remote-tracking branch 'upstream/master' into FlaskSQLAlchemy

21195c1

Comment out set_sql for sanitized_query for seperate code review

4c26818

greengiant added 3 commits December 16, 2017 03:08

Starting to add in set_sql

c3646f1

Add more SQL info to trace

a9f1b0c

Correct URL handling for connection strings

d494d15

haotianw465 added the enhancement label Jan 12, 2018

greengiant added 2 commits February 16, 2018 14:25

Merge remote-tracking branch 'upstream/master' into set_sql

e3f171c

Bug fix and remove sanitized_query

84c5ae0

Fix unit test and add helper util for finding subsegment by annotatio…

bf8d99a

…n key/value

haotianw465 reviewed Feb 19, 2018


Minor cleanups

087dd1f

haotianw465 approved these changes Feb 20, 2018


haotianw465 merged commit d110386 into aws:master Feb 20, 2018

haotianw465 mentioned this pull request Mar 6, 2018

What information does Flask-SQLAlchemy send? #33

Closed

chanchiem mentioned this pull request Jan 28, 2019

Patching sqlalchemy #124

Closed
