Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AioHttp server middleware #3

Merged
merged 1 commit into from
Nov 9, 2017
Merged

AioHttp server middleware #3

merged 1 commit into from
Nov 9, 2017

Conversation

terricain
Copy link
Contributor

Have added middleware for aiohttp's server.

This look ok / do you need anything more?

@haotianw465
Copy link
Contributor

haotianw465 commented Oct 31, 2017

Hi, the basic structure looks good. One question is that are you using any customized context provider in your application?

The default context used by the X-Ray recorder stores segments/subsegments in ThreadLocal. This is fine when a request/response cycle is tied to a thread. Asyncio doesn't operate on thread. It operates on asyncio.Task object.

In case of aiohttp, each handler coroutine becomes one Task in the event loop. It is necessary to have a custom context that associates a segment to a task. That way it guarantees any subsegments (downstream calls to AWS or sql queries etc) generated from the handler can be attached to the correct segment.

Please let me know if you have any extra changes in your application running on aiohttp or any findings on load testing.

@terricain
Copy link
Contributor Author

terricain commented Oct 31, 2017

One question is that are you using any customized context provider in your application?

Nope standard.

The default context used by the X-Ray recorder stores segments/subsegments in ThreadLocal. This is fine when a request/response cycle is tied to a thread. Asyncio doesn't operate on thread. It operates on asyncio.Task object.

Good spot, I totally overlooked this. The project I was working on wasn't getting much traffic. Am going to throw the app under a few load test and I have a feeling we'll need some task specific storage.

If we do indeed need some tasklocal storage, I was thinking to subclass AWSXRayRecorder to make an async version which could specify an AsyncContext, AsyncUDPEmitter (though that is pretty nonblocking anyway) in __init__, and have make end_segment asynchronous. Sound ok to you?

@haotianw465
Copy link
Contributor

haotianw465 commented Nov 1, 2017

You can swap out the default context management with your own today. The recorder class already provides this interface.

xray_recorder.configure(context=my_async_context)

I would suggest subclass the context class to make it work with Async. The recorder always stores/gets context using APIs from context class. Please let me know if the current APIs are not sufficient to make it work. Since the SDK is in beta a breaking change on context API is acceptable if it is necessary.

The other thing is that you might need to patch a few methods of asyncio since it creates new thread to execute the tasks and the context could be lost.

For the UDPEmitter the default one already use nonblocking IO by

self._socket.setblocking(0) 

so not sure what do you mean by implementing an AsyncUDPEmitter.

@terricain
Copy link
Contributor Author

You can swap out the default context management with your own today. The recorder class already provides this interface.

Ahh yeah, I was remembering an older SDK where there was a singleton and it was a fair bit of hassle. Much nicer now 😄.

Ignore me about the UDPEmitter, my brain wasn't working then.

I've added AsyncContext, which makes use of the tasklocals project as I didn't see the point in reinventing the wheel. I put the context in a different file as it imports asyncio and tasklocals which probably won't exist in python27, therefore its more of an opt in. I added tasklocals to the extras_require in setup.py so doing a pip install aws-xray-sdk[async] would pull it in. Might be worth to add warnings to async_context.py to remind people to install it? the import error should be enough though.

I also updated the test case to include a test which would perform concurrent requests, and then ensures the segment IDs are all different. (Though if the task local storage didn't work, it would bomb out anyway)

If this looks ok to you, I think the only thing left with this is to add a bit of documentation/examples.

Copy link
Contributor

@haotianw465 haotianw465 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some docs to show usage of the aiohttp middleware would be helpful. And you don't have to make the aiohttp an opt-in since the only async syntax is only used on async_context and aiohttp middleware. A python2.7 program won't import these modules.

setup.py Outdated
@@ -35,6 +35,7 @@
],

install_requires=['jsonpickle', 'wrapt', 'requests'],
extras_require={'async': ['tasklocals']},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide more information about tasklocals? Its GitHub repo doesn't have a license and it seems like a dead project for almost 4 years so I don't think it is proper to use it as a dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah fair point. I mean its not that hard to re-implement I was just thinking there wasn't a point in reinventing the wheel. In this case I think I'll just implement task local storage directly into async_context.py

tox.ini Outdated
@@ -11,6 +11,7 @@ deps =
coverage
botocore
requests
aiohttp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test running $tox when you have python2.7 or 3.4 installed? The general development workflow for non-language-version-specific features is that contributors should be able to just run $tox against all version of interpreter installed so they know the new changes don't break for any specific version. The aiohttp test would be better be broken into a new command to run with only py3.5 and later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh no I just ran the py3.5 test manually, yeah makes sense as it would fail on anything less than 3.5

@terricain
Copy link
Contributor Author

"Should" be good.

loop = asyncio.get_event_loop()
# Use X-Ray SDK middleware
app = web.Application(middlewares=[middleware])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is strongly recommended to put the X-Ray middleware at first. This has two benefits. One is that other middleware down the chain could throw 403 or other errors and this can be captured if the incoming request hit X-Ray middleware first. The second one is if other middleware tries to generate any subsegments, the segment must be present. It's better to demonstrate the order and its importance in the example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you rather I added a second middleware in the example or just a comment emphasising the importance of that? (as well as adding to the docs)

@haotianw465
Copy link
Contributor

I probably should mention this earlier. Could you do a git rebase on your local branch to merge all your local commits into a single commit? It's easier this way to do code review. More importantly once approved I can directly merge it to master branch. This way we will only have one commit on top of the master and this commit is solely for aiohttp support. It is convenient for people to track changes and maintainer to do releases.

@haotianw465
Copy link
Contributor

The code all look good. One thing I want to point out is that this aiohttp support doesn't work with tracing subtasks. e.g: You cannot trace any extra async functions in your request handler. Here is the reason: Say the server creates task A for an incoming request from user A, then in your request handler there is another async function, task A will create task B for this function execution. The SDK cannot record this async function because when the SDK creates a subsegment for this function, the SDK doesn't know this subsegment's parent since the context for task B is empty.

In order to get nested async working you need to monkey patch asyncio.create_task so that it propagates context from current task to the newly created task.

I just want to make sure you are aware of this issue and it could still cause you problems if later on you add some async functions in your request handler and try to trace these functions. If your PR is not intended to support this advanced use case then some reasoning and limitation in the documentation should be sufficient.

@terricain
Copy link
Contributor Author

Yeah, I noticed that. I think once this is all good I can easily throw in another PR to get subsegments to work, as least then we have basic functionality working. I'll update the docs reflecting this for now and then rebase

@terricain
Copy link
Contributor Author

Ahh, I will fix the subsegments issue, after looking at the create_task code, we can just make a task_factory function, and run set_task_factory() on the event loop, no need to monkey patch 😄

@terricain
Copy link
Contributor Author

Ok have rebased. Sorry mixed in there is some fixes for flake8 (mostly just line > 120)

Copy link
Contributor

@haotianw465 haotianw465 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. A few comments on task_factory would be better.

Added AsyncContext

Replaced task local storage with simpler version

Added task_factory

Documented task_factory
@terricain
Copy link
Contributor Author

Done 😄

@haotianw465 haotianw465 merged commit 63b5ea2 into aws:master Nov 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants