Persistent UDF Materialization #454
Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA. In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, don't hesitate to ping @drewbanin. CLA has not been signed by users: @anaghshineh
def get_bq_routine(self, database: str, schema: str, identifier: str) -> google.cloud.bigquery.Routine:
    """Get a BigQuery routine (UDF) for a schema/model."""
    conn = self.get_thread_connection()
    # backwards compatibility: fill in with defaults if not specified
    database = database or conn.credentials.database
    schema = schema or conn.credentials.schema
    routine_ref = self.routine_ref(database, schema, identifier)
    return conn.handle.get_routine(routine_ref)
Any reason you can't use `info_schema.routines`? If we convert these Python functions to SQL macro equivalents, we could publish this materialization as a dbt package (at least for a trial period).
https://cloud.google.com/bigquery/docs/information-schema-routines
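To make the suggestion concrete, here is a minimal sketch of the kind of lookup a SQL macro could run against `INFORMATION_SCHEMA.ROUTINES` instead of calling `get_routine`. The helper name `routine_lookup_sql` and the identifier check are assumptions made for illustration; they are not part of this PR or of dbt.

```python
import re


def routine_lookup_sql(database: str, schema: str, identifier: str) -> str:
    """Build a query that looks up one routine in INFORMATION_SCHEMA.ROUTINES.

    Hypothetical helper: a dbt macro would render similar SQL with Jinja.
    """
    # Conservative check (an assumption of this sketch) so the name can be
    # interpolated into the SQL string without quoting problems.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", identifier):
        raise ValueError(f"unexpected routine name: {identifier!r}")
    return (
        "SELECT routine_name, routine_type, data_type, routine_definition "
        f"FROM `{database}.{schema}.INFORMATION_SCHEMA.ROUTINES` "
        f"WHERE routine_name = '{identifier}'"
    )


# Example project, dataset, and routine names are placeholders.
print(routine_lookup_sql("my-project", "my_dataset", "parse_url"))
```

An empty result would mean the routine does not exist yet, which is the same information the Python helper gets from a failed `get_routine` call.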
Hey @anaghshineh, this is so cool! Before we think about merging this code in, I'd love to share this with the community. Would you be interested in forking jaffle shop to show how this new materialization would be used? Maybe even recording a quick demo showing how the DAG changes?
Another random thought: what happens if you `{{ ref() }}` a UDF from another model again?
@anaghshineh This is awesome, thanks so much for taking the time & initiative to contribute. I haven't had a chance to play around with this yet, but to @dataders' point, it might already be in a place where community members could copy-paste macros (or install from a package!) to try this out and give early feedback. I think that type of feedback would be really valuable, given that we're talking about a pretty meaningful addition to the dbt user experience on modern data platforms that support persistent UDFs. To that end, I left a big comment + question over on the linked issue: #451 (comment)
Hi @anaghshineh, I'm going to close that PR for now. We haven't reached a clear consensus on what to do with UDFs yet, and the next step would be to build momentum towards that following @dataders' advice. This is fantastic work that you did here, and I feel bad about closing this PR. We will leverage your work if/when the situation evolves.
I think this is an independent experimental implementation in a real-world public project: https://tempered.works/posts/2024-02-19-udf-dbt-models/ (mentioned on Slack channel #i-made-this)
Note discussion #10395 on support for UDFs as a materialization
resolves #451
Description
Adds `udf` materialization in support of BQ persistent SQL UDFs. This materialization allows users to declare SQL UDFs as models and manage them using `dbt` commands, like `dbt ls`, `dbt compile`, and `dbt run`.
The new materialization takes two optional configuration arguments: `args` and `return_type`. The former is an array of dictionaries. Each individual dictionary represents a single UDF argument and consists of `name` and `type` keys. The `name` key specifies the name of the argument. The `type` key specifies the type of the argument (e.g., `INT64` or `STRING`). An array is used to preserve the order of the arguments provided: the order in which arguments are listed in the array will be the order in which they are declared as arguments for the associated UDF. The `return_type` configuration argument specifies the type of the item returned by the UDF (e.g., `STRING` or `STRUCT<domain STRING, path STRING>`).
Checklist
`changie new` to create a changelog entry
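As a rough illustration of the `args` and `return_type` configuration described in the Description, the sketch below renders BigQuery DDL for a persistent SQL UDF from those two values. The helper `render_udf_ddl`, the function name, and the UDF body are hypothetical and chosen for the example; this is not the materialization code from this PR.

```python
from typing import Dict, List, Optional


def render_udf_ddl(
    name: str,
    body: str,
    args: Optional[List[Dict[str, str]]] = None,
    return_type: Optional[str] = None,
) -> str:
    """Render CREATE FUNCTION DDL for a SQL UDF (hypothetical helper)."""
    # Arguments are emitted in list order, which is why `args` is an array
    # of dictionaries rather than a single mapping.
    arg_list = ", ".join(f"{arg['name']} {arg['type']}" for arg in (args or []))
    # Both settings are optional; when `return_type` is omitted the RETURNS
    # clause is left out of the statement.
    returns = f" RETURNS {return_type}" if return_type else ""
    return f"CREATE OR REPLACE FUNCTION {name}({arg_list}){returns} AS ({body});"


# Placeholder dataset, function name, and body for illustration only.
ddl = render_udf_ddl(
    name="my_dataset.label_value",
    body="CONCAT(label, CAST(x AS STRING))",
    args=[{"name": "x", "type": "INT64"}, {"name": "label", "type": "STRING"}],
    return_type="STRING",
)
print(ddl)
```

Swapping the two dictionaries in `args` would swap the declared parameter order, which is the behavior the array is meant to guarantee.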