Adding upload_file adapter #121
@McKnight-42 Can I get some eyes on this please? Eager to see this functionality added to
Hi @pgoslatara taking a look over today and will pass along some questions sometime tomorrow, sorry for the delay.
@jtcohen6 I'm curious if you have any thoughts on this one way or the other?
@pgoslatara good catch on the missing For the
@McKnight-42 I've added a basic test for uploading three different types of files (CSV, NDJSON and parquet), bfe11ba. I've never previously worked with tests so this is all new to me. Can you take a look and see if these make sense?
@pgoslatara Great start on the tests, though I feel we need to go a little further, possibly adding queries against the newly created tables and some simple checks that they actually contain data, just in case an error happens and the database only creates an empty table.
This is a great suggestion. I've added checks for the number of rows, distinct
@pgoslatara sorry about the delay. Great work on this. I've tested the updates locally and they are looking good. I believe all that's left is for you to update the changelog and mark off the rest of the checklist.
@McKnight-42 Done! Thanks for your input on this one, I learned a lot about how to think about tests.
Tests both the success of the load into the database and the pull-down of table information to make sure it's all there, and covers many forms of upload. Well done!
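A check of the kind discussed in this thread (confirming that a load did not silently produce an empty table) could look roughly like the sketch below. This is not the test code from the PR: `count_rows` and `check_table_row_count` are hypothetical helper names, and `client` is assumed to be a `google.cloud.bigquery.Client` (or anything exposing a compatible `query(...).result()`).

```python
from typing import Any


def count_rows(client: Any, table_id: str) -> int:
    """Return the number of rows in table_id ("project.dataset.table")."""
    sql = f"SELECT COUNT(*) AS row_count FROM `{table_id}`"
    # query() starts the job; result() blocks until it finishes and yields rows.
    rows = list(client.query(sql).result())
    return rows[0]["row_count"]


def check_table_row_count(client: Any, table_id: str, expected_rows: int) -> None:
    """Raise AssertionError if the table does not hold the expected row count."""
    actual = count_rows(client, table_id)
    if actual != expected_rows:
        raise AssertionError(
            f"{table_id}: expected {expected_rows} rows, found {actual}"
        )
```

Comparing against an exact expected count, rather than only checking for "more than zero", also catches partial loads.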
* Adding upload_file adapter
* flake8 formatting
* Replacing get_timeout with newer get_job_execution_timeout_seconds
* Adding integration tests for upload_file macro
* Removing conn arg from table_ref method
* Updating schema method to upload_file
* Adding checks on created tables
* Correcting class name
* Updating CHANGELOG.md

Co-authored-by: Matthew McKnight <[email protected]>
resolves #102
Description
Adding an upload_file adapter that uses the load_table_from_file method. This adapter takes multiple arguments, allowing maximum customisation of the LoadJobConfig class.
Example files:
How to use:
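As a rough illustration of the mechanism described above (not the adapter code from this PR), the sketch below forwards arbitrary keyword arguments to `LoadJobConfig` and hands the opened file to `load_table_from_file` from the `google-cloud-bigquery` client library. The standalone function signature, the `make_config` hook (there only so the function can be exercised without the library installed), and the 300-second default timeout are assumptions made for this example.

```python
from typing import Any, Callable, Optional


def upload_file(
    client: Any,
    local_file_path: str,
    table_id: str,
    timeout: float = 300.0,
    make_config: Optional[Callable[..., Any]] = None,
    **kwargs: Any,
) -> Any:
    """Load a local file into table_id, passing kwargs on to LoadJobConfig."""
    if make_config is None:
        # Imported lazily so the sketch can be read and tested without the library.
        from google.cloud import bigquery

        make_config = bigquery.LoadJobConfig

    # Every keyword argument becomes a LoadJobConfig property,
    # e.g. source_format, skip_leading_rows, autodetect, write_disposition.
    job_config = make_config(**kwargs)

    with open(local_file_path, "rb") as f:
        job = client.load_table_from_file(
            f, table_id, rewind=True, job_config=job_config
        )

    # Block until the load job finishes; raises if the job failed.
    job.result(timeout=timeout)
    return job
```

With a real client, a CSV load might be invoked as `upload_file(client, "data.csv", "my-project.my_dataset.my_table", source_format="CSV", skip_leading_rows=1, autodetect=True)`, where the file name and table ID are placeholders.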
Open questions from my side:
load_dataframe which does not have tests. Should an issue be opened to add tests for both load_dataframe and upload_file?
Comment:
Neither manifest.json nor run_results.json can be uploaded using this adapter, as they do not conform to the NDJSON specification. If this PR is merged, an issue can be opened to address this (possibly by altering these files before calling the upload_file adapter).
Checklist
CHANGELOG.md and added information about my change to the "dbt-bigquery next" section.
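On the comment about manifest.json and run_results.json: one way to "alter these files" before an upload is to rewrite the JSON document so that each record sits on a single line. The sketch below is a minimal illustration using only the standard library; `json_to_ndjson` is a hypothetical helper, not part of this PR. It addresses only the one-record-per-line requirement, and whether BigQuery accepts the resulting field names and nesting is a separate question.

```python
import json


def json_to_ndjson(src: str, dst: str) -> int:
    """Rewrite the JSON document at src as NDJSON at dst.

    A top-level list becomes one line per element; any other top-level
    value becomes a single line. Returns the number of lines written.
    """
    with open(src, "r", encoding="utf-8") as f:
        document = json.load(f)

    records = document if isinstance(document, list) else [document]

    with open(dst, "w", encoding="utf-8") as f:
        for record in records:
            # json.dumps without indent emits no raw newlines; newlines
            # inside string values are escaped as \n.
            f.write(json.dumps(record, separators=(",", ":")) + "\n")

    return len(records)
```

The converted file could then be passed to the upload step with a newline-delimited JSON source format.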