Add asyncio support for all components #191

Merged
merged 2 commits into from
Feb 19, 2020


joncinque
Contributor

In order to minimize the copy-paste between the async and sync
implementations, all of the non-base components are sym-linked to their
sync counterparts. Since these import AlphaVantage locally, they
automatically pick up the async version, which is really nifty!

There was some refactoring in SectorPerformance to make the re-use
possible.

The async version of AlphaVantage tries to re-use as much as possible
from the sync version. The main differences are:

  • a new session for the HTTP calls
  • a new close() function which should be called when you're done
  • proxy is a string instead of a dict

Using it:

import asyncio
from alpha_vantage.async_support.timeseries import TimeSeries

async def get_data():
    ts = TimeSeries(key='YOUR_API_KEY')
    data, meta_data = await ts.get_intraday('GOOGL')
    await ts.close()

loop = asyncio.get_event_loop()
loop.run_until_complete(get_data())
loop.close()
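
Since `close()` should be called when you're done, it may be worth guarding it with `try`/`finally` so the session is released even when a request fails. The sketch below is only an illustration of that pattern: `FakeClient` and `fetch_and_close` are hypothetical stand-ins written for this example, not part of alpha_vantage.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for an async client that owns an HTTP session."""

    def __init__(self):
        self.closed = False

    async def fetch(self, symbol):
        if symbol is None:
            raise ValueError("symbol is required")
        await asyncio.sleep(0)  # pretend to wait on network I/O
        return {"symbol": symbol}

    async def close(self):
        self.closed = True


async def fetch_and_close(client, symbol):
    try:
        return await client.fetch(symbol)
    finally:
        # Runs on success and on error, so the session is always released.
        await client.close()
```

The same shape should carry over to the real classes by swapping `FakeClient` for `TimeSeries` and `fetch` for the call you need.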

@PatrickAlphaC
Collaborator

PatrickAlphaC commented Feb 14, 2020

This is fantastic!
For some context: http://masnun.rocks/2016/10/06/async-python-the-different-forms-of-concurrency/

I'm trying to figure out if there might be a simpler way than this. Like maybe:

import asyncio
from alpha_vantage.timeseries import TimeSeries

async def get_data():
    ts = TimeSeries(key='YOUR_API_KEY', async = True)
    data, meta_data = await ts.get_intraday('GOOGL')
    await ts.close()
loop = asyncio.get_event_loop()
loop.run_until_complete(get_data())
loop.close()

This may show my lack of knowledge of the package, but do you know how someone would currently go about async programming with it? I only know how to multithread at the moment.

What do you think?

@PatrickAlphaC
Collaborator

Or maybe even something like:

from alpha_vantage.async_support.timeseries import TimeSeries
ts = TimeSeries(key='YOUR_API_KEY')
data, meta_data = ts.get_intraday(['GOOGL', 'AAPL', 'TSLA'], async = True)

@joncinque
Contributor Author

The asyncio library has changed a lot over the past few iterations of Python 3, so the code provided will only work on Python 3.5+, which is when the async and await keywords were introduced. As a side note, I didn't include any Python version checks, so it would be best to add some documentation to the README saying that the async version is only compatible with Python 3.5 and above.
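
If a runtime check were wanted in addition to the README note, it could look roughly like the sketch below. `MIN_ASYNC_VERSION` and `supports_async` are hypothetical names for this illustration, not something in the PR. Such a check would have to live in a module that contains no async/await syntax itself, since older interpreters fail at parse time before any check can run.

```python
import sys

# async/await syntax first appeared in Python 3.5
MIN_ASYNC_VERSION = (3, 5)


def supports_async(version_info=sys.version_info):
    """Return True if the given interpreter version can parse async/await."""
    return tuple(version_info[:2]) >= MIN_ASYNC_VERSION
```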
As for the example I provided, starting with Python 3.7 there's a helper function, asyncio.run(), which makes the example cleaner:

import asyncio
from alpha_vantage.async_support.timeseries import TimeSeries

async def get_data():
    ts = TimeSeries(key='YOUR_API_KEY')
    data, meta_data = await ts.get_intraday('GOOGL')
    await ts.close()

asyncio.run(get_data())

As for the placement of the files, I followed the model of another library, which makes it extremely clear when you're using the async version rather than the sync one by explicitly importing from the async_support package. Reference for that library: https://github.com/ccxt/ccxt#python

Otherwise, we would need to create async versions of all the functions in the base class and name them differently, e.g. get_intraday_async, since async functions must be called from an async environment, meaning some form of asyncio event loop.

There are some interesting thoughts in this SO thread on the duplication of code in libraries that want to support both sync and async: https://stackoverflow.com/questions/55152952/duplication-of-code-for-synchronous-and-asynchronous-implementations

To summarize, your first example would only work if the whole library were to be async, without the sync versions, and you'll want to keep supporting sync calls. Your second example wouldn't work unfortunately, because an async function must be awaited to even run.
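
To make the point about awaiting concrete, here is a small standard-library-only sketch. Calling an async function without `await` only builds a coroutine object and none of its body runs. The function names are made up for this illustration.

```python
import asyncio
import inspect


async def get_value():
    return 42


def call_without_await():
    coro = get_value()  # nothing has run yet; this is just a coroutine object
    is_coro = inspect.iscoroutine(coro)
    coro.close()  # discard it cleanly to avoid a "never awaited" warning
    return is_coro


async def call_with_await():
    # Awaiting is what actually drives the coroutine to completion.
    return await get_value()
```

Here `call_without_await()` reports that it got a coroutine object rather than 42, while `asyncio.run(call_with_await())` produces the value.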

The idea of using asyncio is to achieve really good performance all from one thread. The event loop does the job of intelligently scheduling the work between simultaneous tasks, switching whenever there's some sort of "heavy" operation, usually I/O.

For some background, I recently converted a large codebase to asyncio. The program essentially made many requests calls using 30-50 threads. After the conversion, there was a slight increase in CPU usage, but the performance went up roughly 6x! This way, if people want to fetch a lot of AlphaVantage data, instead of creating many threads, they could just schedule a bunch of get_data() calls and gather them in one place, e.g.:

loop.run_until_complete(asyncio.gather(get_data(), get_data(), get_data()))

and asyncio would do the job of handling concurrency.
https://docs.python.org/3/library/asyncio-task.html#running-tasks-concurrently for reference as well
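
As a rough illustration of that scheduling, the sketch below uses `asyncio.sleep` as a stand-in for network latency, so it runs without an API key. `fake_request`, `fetch_all` and `timed_fetch` are hypothetical names for this example. Three simulated requests of 0.2 seconds each should finish in roughly 0.2 seconds in total rather than 0.6, all on one thread.

```python
import asyncio
import time


async def fake_request(name, delay):
    await asyncio.sleep(delay)  # stands in for waiting on a network response
    return name


async def fetch_all(names, delay=0.2):
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(fake_request(n, delay) for n in names))


def timed_fetch(names, delay=0.2):
    start = time.perf_counter()
    results = asyncio.run(fetch_all(names, delay))
    return results, time.perf_counter() - start
```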

Let me know if anything in my answer is confusing or unclear!

@PatrickAlphaC
Collaborator

PatrickAlphaC commented Feb 19, 2020

This is great @joncinque, I'm going to add this to the develop branch as-is and will do some testing from there.

Are you currently looking to use this for a project? Would you mind installing from develop and checking whether anything needs fine-tuning? I'll do some testing to see how it affects different versions of Python, and I've been meaning to upgrade the tests with GitHub Actions here anyway.

I love how you modeled it after the ccxt library, that's a project that I LOVE (and I've been hoping they add decentralized exchange support soon...)

Thank you!

After doing some more research I found out this won't work in IPython/Jupyter Notebooks, so this is my note to update the README.md with that information as well.
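
A likely cause (an assumption on my part, not something verified in this thread) is that notebooks already run their own event loop, so `loop.run_until_complete()` is rejected with a RuntimeError. A small standard-library sketch for detecting that situation, with hypothetical helper names and requiring Python 3.7+:

```python
import asyncio


def loop_is_running():
    """Return True when called from inside a running event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # get_running_loop() raises when no loop is running in this thread.
        return False
    return True


async def check_inside_coroutine():
    return loop_is_running()
```

In a plain script `loop_is_running()` should report False at module level, whereas inside a coroutine (or, presumably, a notebook cell) it reports True, in which case awaiting the coroutine directly is the usual alternative to `run_until_complete()`.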

One more thing, I think an easier example is as follows:

import asyncio
from alpha_vantage.async_support.timeseries import TimeSeries


async def get_data(symbol):
    ts = TimeSeries(key='YOUR_API_KEY')
    data, meta_data = await ts.get_intraday(symbol)
    await ts.close()
    return data

loop = asyncio.get_event_loop()
cat = loop.run_until_complete(get_data('AAPL'))
print(cat)
loop.close()

This way users can see the data they returned, and for multiple calls:

import asyncio
from alpha_vantage.async_support.timeseries import TimeSeries

symbols = ['AAPL', 'GOOG', 'TSLA', 'MSFT']


async def get_data(symbol):
    ts = TimeSeries()
    data, _ = await ts.get_quote_endpoint(symbol)
    await ts.close()
    return data

loop = asyncio.get_event_loop()
tasks = [get_data(symbol) for symbol in symbols]
group1 = asyncio.gather(*tasks)
results = loop.run_until_complete(group1)
loop.close()
print(results)

Let me know if you think otherwise.

@PatrickAlphaC PatrickAlphaC merged commit 20df386 into RomelTorres:develop Feb 19, 2020
@joncinque
Contributor Author

Those changes for the README are great, and feel free to make any additional changes to work with your dependencies and environments. For Python 3.4, the uses of await will need to be changed to yield from, and all async def functions will need to become plain def functions with the @asyncio.coroutine decorator. There's some more info about that difference at https://stackoverflow.com/questions/44251045/what-does-the-yield-from-syntax-do-in-asyncio-and-how-is-it-different-from-aw

Otherwise, I do need this for a project, but I can use a local version no problem. Let me know if you need any other help or clarifications!

@PatrickAlphaC
Collaborator

Perfect! Let's keep it in develop for a little bit and we and the community can both play with it. As you continue your project, if you see improvements, feel free to add them here.

Otherwise, once we can figure out how to make sure the code won't break on different versions of Python, we can merge it to the production/master branch.

Thanks again for this great PR.

PatrickAlphaC added a commit to PatrickAlphaC/alpha_vantage that referenced this pull request Apr 26, 2020
* Fixing pypi badge

* Fixing typo on README

* Use ==/!= to compare str, bytes, and int literals

Identity is not the same thing as equality in Python so use ==/!= to compare str, bytes, and int literals. In Python >= 3.8, these instances will raise SyntaxWarnings so it is best to fix them now. https://docs.python.org/3.8/whatsnew/3.8.html#porting-to-python-3-8

$ python
```
>>> pandas = "panda"
>>> pandas += "s"
>>> pandas == "pandas"
True
>>> pandas is "pandas"
False
```

* added rapidapi key integration

* prep for 2.1.3

* Removing get_batch_stock_quotes method (RomelTorres#189)

* Removing get_batch_stock_quotes method

Remove get_batch_stock_quotes method, resolving issue RomelTorres#184.

* remove tests for get_batch_stock_quotes

* Delete mock_batch_quotes

Was only used for the tests removed in the previous commits, which made this file dead weight.

* Add asyncio support for all components (RomelTorres#191)

* Add asyncio support for all components

* Add asyncio packages to setup.py

* Issue RomelTorres#206: Add 'time_period' argument to 'get_bbands()' documentation. (RomelTorres#207)

* fixed fx documentation

* fixed pypi badge for documentation

* Fixes small documentation bugs (RomelTorres#208)

* fixed fx documentation

* fixed pypi badge for documentation

* prep for 2.2.0

* small documentaiton change for 2.2.0

Co-authored-by: Igor Tavares
Co-authored-by: Christian Clauss
Co-authored-by: Aaron Sanders
Co-authored-by: Jon Cinque
Co-authored-by: Peter Anderson
PatrickAlphaC added a commit that referenced this pull request Apr 26, 2020
* added rapidapi key integration

* prep for 2.1.3

* Removing get_batch_stock_quotes method (#189)

* Removing get_batch_stock_quotes method

Remove get_batch_stock_quotes method, resolving issue #184.

* remove tests for get_batch_stock_quotes

* Delete mock_batch_quotes

Was only used for the tests removed in the previous commits, which made this file dead weight.

* Add asyncio support for all components (#191)

* Add asyncio support for all components

* Add asyncio packages to setup.py

* Issue #206: Add 'time_period' argument to 'get_bbands()' documentation. (#207)

* Fixes small documentation bugs (#208)

* fixed fx documentation

* fixed pypi badge for documentation

* prep for 2.2.0 (#209)

* fixed fx documentation

* fixed pypi badge for documentation

* prep for 2.2.0

* small documentaiton change for 2.2.0

Co-authored-by: Aaron Sanders
Co-authored-by: Jon Cinque
Co-authored-by: Peter Anderson
PatrickAlphaC added a commit to PatrickAlphaC/alpha_vantage that referenced this pull request Apr 26, 2020
* Prep for 2.1.3 (RomelTorres#178)

* Fixing pypi badge

* Fixing typo on README

* Use ==/!= to compare str, bytes, and int literals

Identity is not the same thing as equality in Python so use ==/!= to compare str, bytes, and int literals. In Python >= 3.8, these instances will raise SyntaxWarnings so it is best to fix them now. https://docs.python.org/3.8/whatsnew/3.8.html#porting-to-python-3-8

$ python
```
>>> pandas = "panda"
>>> pandas += "s"
>>> pandas == "pandas"
True
>>> pandas is "pandas"
False
```

* added rapidapi key integration

Co-authored-by: Igor Tavares
Co-authored-by: Christian Clauss
Co-authored-by: Patrick Collins

* Prep for 2.2.0 (RomelTorres#210)

* added rapidapi key integration

* prep for 2.1.3

* Removing get_batch_stock_quotes method (RomelTorres#189)

* Removing get_batch_stock_quotes method

Remove get_batch_stock_quotes method, resolving issue RomelTorres#184.

* remove tests for get_batch_stock_quotes

* Delete mock_batch_quotes

Was only used for the tests removed in the previous commits, which made this file dead weight.

* Add asyncio support for all components (RomelTorres#191)

* Add asyncio support for all components

* Add asyncio packages to setup.py

* Issue RomelTorres#206: Add 'time_period' argument to 'get_bbands()' documentation. (RomelTorres#207)

* Fixes small documentation bugs (RomelTorres#208)

* fixed fx documentation

* fixed pypi badge for documentation

* prep for 2.2.0 (RomelTorres#209)

* fixed fx documentation

* fixed pypi badge for documentation

* prep for 2.2.0

* small documentaiton change for 2.2.0

Co-authored-by: Aaron Sanders
Co-authored-by: Jon Cinque
Co-authored-by: Peter Anderson