Add integration with Google Calendar API #8471

Closed
JeffryMAC opened this issue Apr 20, 2020 · 22 comments · Fixed by #20542
Labels: good first issue; kind:feature (Feature Requests); provider:google (Google (including GCP) related issues)

Comments

@JeffryMAC

Description

I would like an operator that creates an events table from the Google Calendar API:
https://developers.google.com/calendar/v3/reference/events/import

Use case / motivation

We have a calendar where we save events that have business impact, and we need this data available to our analysts in tables.

It would also be nice if the hook could insert events into the calendar based on logic from the pipeline.

@JeffryMAC JeffryMAC added the kind:feature Feature Requests label Apr 20, 2020
@mik-laj mik-laj added provider:google Google (including GCP) related issues good first issue labels Apr 20, 2020
@wojtek2kdev

Can I try to handle that task?

@mik-laj
Member

mik-laj commented May 7, 2020

Of course. Go ahead and start work. I will gladly help you if you have questions. 🇵🇱

@mik-laj
Member

mik-laj commented May 7, 2020

Here is the GCP integration guide:
https://docs.google.com/document/d/1_rTdJSLCt0eyrAylmmgYc3yZr-_h51fVlnvMmWqhCkY/edit
Many topics are common between Google and GCP.

@wojtek2kdev

@JeffryMAC @mik-laj Could you help me specify the task in more detail? Should the operator only fetch calendar events, organize them into a table structure (maybe CSV format?) and pass the result to the next task, or should it persist the data into an SQL table?

"It would also be nice if the hook could insert events into the calendar based on logic from the pipeline."

That part is not clear to me at all and needs clarification. Does it mean that the hook should be able to insert or update an event conditionally? For example, if there is already at least one event on a particular day, do not add another one?

@JeffryMAC
Author

@wojtek2kdev the idea is to be able to pull events from Calendar. This is useful for analyzing whether some events have an effect on the data (for example: on 2.10.19 the office was closed due to an employee trip). So we want our automated process to take calendar events into account when making calculations and analysis.

On the other hand, we also want the ability to post events to the calendar. For example, if the ETL discovers issues, we want to be able to call hook.post_event(recipients, title), which will create an event in the recipients' calendar.
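
To make the post_event idea concrete, here is a minimal sketch of what such a call might do. Only the name post_event and its recipients/title arguments come from the comment above; the helper build_event_body, the extra parameters (service, calendar_id, start, duration_minutes) and the 30-minute default are assumptions for illustration. The sketch assumes `service` is a Calendar v3 client built with google-api-python-client and uses its events().insert call.

```python
from datetime import datetime, timedelta, timezone


def build_event_body(recipients, title, start, duration_minutes=30):
    """Build a request body for the Calendar API events.insert method.

    Hypothetical helper: the field names follow the Calendar v3 Event
    resource, the default duration is an arbitrary assumption.
    """
    end = start + timedelta(minutes=duration_minutes)
    return {
        "summary": title,
        # Timezone-aware datetimes serialize to RFC 3339 timestamps.
        "start": {"dateTime": start.isoformat()},
        "end": {"dateTime": end.isoformat()},
        "attendees": [{"email": address} for address in recipients],
    }


def post_event(service, calendar_id, recipients, title, start):
    """Insert an event; `service` is assumed to be a Calendar v3 client."""
    body = build_event_body(recipients, title, start)
    return service.events().insert(calendarId=calendar_id, body=body).execute()
```

Keeping the body construction separate from the API call means the pipeline logic can be checked without network access or credentials.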

@wojtek2kdev

@JeffryMAC So it would be fine to pull events and organize them into CSV format? Then it would be easy to load them into e.g. pandas in a later task. So it's not about saving events into a particular SQL database? Then it should look like:
[Some task] >> [Calendar operator: return events in csv] >> [Analysis]

@mik-laj
Member

mik-laj commented May 10, 2020

@wojtek2kdev In my opinion, JSONL is the best format in this case.

@mik-laj
Member

mik-laj commented May 10, 2020

It should save the data to GCS (the data lake). Once the data is in GCS, you can use many tools to analyze it, including BigQuery.
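
As a rough illustration of the JSONL suggestion, the sketch below turns a list of event dictionaries into JSON Lines text, one object per line. The function name events_to_jsonl and the sample fields are assumptions, not part of any existing operator.

```python
import json


def events_to_jsonl(events):
    """Serialize an iterable of event dicts as JSON Lines.

    Each event becomes one JSON object on its own line, which is the
    layout BigQuery expects for newline-delimited JSON sources.
    """
    return "".join(json.dumps(event, sort_keys=True) + "\n" for event in events)


# Hypothetical sample data, shaped loosely like Calendar API events.
sample_events = [
    {"id": "1", "summary": "Office closed"},
    {"id": "2", "summary": "Employee trip"},
]
jsonl_text = events_to_jsonl(sample_events)
```

The resulting string could then be written to a GCS object, for example through the Google provider's GCS hook; that upload step is left out here because it needs credentials and a bucket.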

@wojtek2kdev

@mik-laj Ok, I'll try that way. It sounds good.

@mik-laj
Member

mik-laj commented May 11, 2020

@wojtek2kdev Can you check whether we can use the airflow.providers.amazon.aws.operators.google_api_to_s3_transfer.GoogleApiToS3Transfer operator to fetch calendar events? If yes, then we just need to write a new variant that will allow us to download the data to GCS.

@wojtek2kdev

@mik-laj I think we can. We only need to choose the calendar service and prepare the desired query. So should we create a Google API to GCS transfer operator? Will that be sufficient for the case described in the issue, together with a calendar hook for creating, updating and removing events?

Additionally, I have a few notes. One: I know it's more art than science, but I think it would be better if operators were organized into packages by type or use case. For example, s3_to_gcs, bigquery_to_gcs, bigquery_to_mysql and so on would live in e.g. an operators.transfer package, operators that only execute API CRUD operations would live in operators.api, etc. I think we could identify more operation types and create packages to keep things in order, instead of having so many different operators in a single folder. Just a proposal.

Two: why do we create hooks that only wrap API calls? For example, sheets.py. Also, why must we implement the get_conn method in every hook/operator that uses Google services, instead of inheriting it or writing some smart decorator?

@mik-laj
Member

mik-laj commented May 11, 2020

@mik-laj Yes. One generic operator is more useful. I think it is also worth creating an operator that adds a calendar entry. Some organizations do not use hooks but only operators.

We had a lot of discussions on this subject. https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths
https://lists.apache.org/thread.html/f9262d9faa45fce6523bf85b7a5295a44cac419eb7fafeaeef1c7755%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/b461c5d03c2f32d3379a1f9ddacc36e5ef077dccda0e37843064e409%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/bdc1a070295121782e2e054ae62cac363ef84540b48f750916eae88a%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/2c9559184045e772acd21cbdd7435f6bf89c76eb9311311d58d16e5f%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/b07a93c9114e3d3c55d4ee514955bac79bc012c7a00db627c6b4c55f%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/0a2bccd52e7a303c2191d072029e1c1647dba76d8aae2ccef28d9780%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/df4340c1972952de5f3f69336db3c1f5064d145863f8223fda0c8137%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/d28fdba79b226c4b5e85e84a28da6ff318797b063c3990254a1a79bc%40%3Cdev.airflow.apache.org%3E
https://lists.apache.org/thread.html/4e648d9421c792d4537f5ac66f1a16dce468f816fc5221a9f9db9433%40%3Cdev.airflow.apache.org%3E
and others.

Not every hook method is just a wrapper. Some methods introduce other minor improvements, e.g. a fallback for the project ID or waiting for the result. These changes most often come from adapting the library better to Airflow. You only need to update one class and all operators will still work.
We often try to respect the KISS rule.
A separate layer in the form of a hook also makes it easier to maintain backward compatibility.
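
The project ID fallback mentioned above can be pictured with a small decorator. This is only a sketch of the general technique, not Airflow's implementation: with_project_id_fallback, ExampleHook, default_project_id and list_items are all hypothetical names.

```python
import functools


def with_project_id_fallback(method):
    """Fill in project_id from the hook's default when the caller omits it."""

    @functools.wraps(method)
    def wrapper(self, *args, project_id=None, **kwargs):
        if project_id is None:
            # Fall back to the value configured on the hook instance.
            project_id = self.default_project_id
        if project_id is None:
            raise ValueError("project_id was not passed and no default is set")
        return method(self, *args, project_id=project_id, **kwargs)

    return wrapper


class ExampleHook:
    """Hypothetical hook; a real one would wrap an API client."""

    def __init__(self, default_project_id=None):
        self.default_project_id = default_project_id

    @with_project_id_fallback
    def list_items(self, *, project_id):
        # Stand-in for an API call that requires a project ID.
        return f"projects/{project_id}/items"
```

Because the fallback lives in one place, every operator that calls the hook gets the same behavior without repeating the lookup.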

@mik-laj
Member

mik-laj commented May 26, 2020

@wojtek2kdev Do you need any help? Do you have any questions?

@subkanthi
Contributor

@mik-laj, is this something I can take on? I'm trying to find a good first issue. I have been using Airflow for a few years and have written custom operators, so I thought I would start contributing.

@JeffryMAC
Author

@wojtek2kdev are you working on this?

@JeffryMAC
Author

@mik-laj if @subkanthi is interested, can he start working on it?

@mik-laj
Member

mik-laj commented Aug 25, 2020

@subkanthi @JeffryMAC @wojtek2kdev I assigned you to this ticket. I hope you enjoy working on it together. :-D

@eladkal
Contributor

eladkal commented Jun 11, 2021

Since no PR has been raised since August, I'm clearing all assignees.
The issue is open for anyone who wants to pick it up.

@subkanthi
Contributor

Hey @eladkal, I'm back working on this and should have a PR soon.

@subkanthi subkanthi removed their assignment Oct 18, 2021
@rsg17
Contributor

rsg17 commented Dec 18, 2021

Hi @eladkal, I am new to Airflow and would like to take a pass at this if no one else is working on it.

@potiuk
Member

potiuk commented Dec 18, 2021

Feel free, just ask @subkanthi if he is also OK with it.

@subkanthi
Contributor

Thanks for checking. @rsg17, please go for it. I have some code in this branch for inserting an event into the calendar, if it's helpful:

https://github.com/subkanthi/airflow/tree/subkanthi/google_calendar_api_8471



8 participants