Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Connector Registration and Dynamic Dispatch #1375

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

alexrichey
Copy link
Contributor

@alexrichey alexrichey commented Jan 6, 2025

Implements dynamic (ie non-static / hardcoded) dispatcher for our connectors, at the lifecycle level. The idea is that if you define an destination type as, say, ginger_ftp as below
image
then the distribution code should delegate the push to a connector registered as ginger_ftp.

Impetus

This approach would enable others to register custom connectors. E.g. if another agency needed a snowflake connector, they could just register as here.

Difficulties

Emphasis on the dynamic part here. We'd lose a lot of type-safety.

Other Changes

  1. refactor lifecycle.distribute.socrata to be a generic distributor (ie there should be nothing Socrata specific). The distributor should delegate to whatever connector is registered.
  2. Move anything socrata specific, or anything that references data-validate into lifecycle.scripts.
  3. Make ConnectorProtocol Generic.

Super minor fixes

  • Noticing that sometimes in code this lifecycle stage is referred to as distribute and other times distribution. Standardize on distribute.

@alexrichey alexrichey force-pushed the ar-connectors-dynamic-dispatch branch from d4f5a7e to b92c85b Compare January 6, 2025 18:53
@fvankrieken
Copy link
Contributor

I like this a lot. I had taken a slightly different approach in #1282 and while there are things I like about my approach, as written it still tethers too closely to ingest. There are things I like about keeping it more object oriented (going for a method where we parse a Dataset from ingest/product yml and then call ds.push() or something like that, but it requires a weird dance between pydantic and our actual code that felt clunky and while we lose some type safety this way, I think I still prefer it.

@fvankrieken
Copy link
Contributor

fvankrieken commented Jan 7, 2025

So then this also aligns with where ingest should try to move towards - mainly, getting all the "template"/"config" pydantic models into dcpy.models.ingest and out of connectors, since as written they're quite specific to the ingest yml declaration, and then that seems like something that we could get to play nicely with this sort of approach.

@sf-dcp
Copy link
Contributor

sf-dcp commented Jan 7, 2025

I like this too. Is it possible to define required/optional args at a connector level rather than unpacking kwargs and receiving an error when pushing/pulling? Or would it conflict with the underlying ABC class? Thinking of having some level of type safety

@alexrichey alexrichey force-pushed the ar-connectors-dynamic-dispatch branch from b92c85b to 027e905 Compare January 9, 2025 23:16
Copy link

codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 81.25000% with 12 lines in your changes missing coverage. Please review.

Project coverage is 70.85%. Comparing base (31724e0) to head (d86b864).
Report is 29 commits behind head on main.

Files with missing lines Patch % Lines
dcpy/lifecycle/distribute/connectors.py 65.21% 8 Missing ⚠️
dcpy/connectors/ftp.py 60.00% 2 Missing ⚠️
dcpy/models/connectors/__init__.py 90.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1375      +/-   ##
==========================================
+ Coverage   70.42%   70.85%   +0.42%     
==========================================
  Files         115      124       +9     
  Lines        5979     6104     +125     
  Branches      695      706      +11     
==========================================
+ Hits         4211     4325     +114     
- Misses       1622     1629       +7     
- Partials      146      150       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alexrichey alexrichey force-pushed the ar-connectors-dynamic-dispatch branch 2 times, most recently from be2cfcb to 24e1a06 Compare January 10, 2025 15:41
@alexrichey
Copy link
Contributor Author

@fvankrieken @sf-dcp I've changed things to find a nice potential middle-ground where we can type-safety, by just binding/configuring connectors at the lifecycle level. The idea being that for ingest/package/distribute we have a pretty fixed set of args that we'd pass to these dispatchers, so it makes sense to establish interfaces/protocols at the lifecycle level. That leaves us free to keep using sensible arguments in the connectors themselves.

@alexrichey alexrichey force-pushed the ar-connectors-dynamic-dispatch branch from 24e1a06 to 2a05958 Compare January 10, 2025 15:50
@alexrichey alexrichey changed the title WIP / POC of dynamic dispatch for connectors Add Connector Registration and Dynamic Dispatch Jan 10, 2025
@alexrichey alexrichey force-pushed the ar-connectors-dynamic-dispatch branch 3 times, most recently from 7f2538d to e402142 Compare January 15, 2025 16:40
also add mock dispatcher for FTP to illustrate the concept
@alexrichey alexrichey force-pushed the ar-connectors-dynamic-dispatch branch from e402142 to d86b864 Compare January 15, 2025 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: New
Development

Successfully merging this pull request may close these issues.

4 participants