Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add datafusion-json-functions as optional extension #143

Merged
merged 1 commit into from
Sep 16, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Sep 15, 2024

Closes #130

Changes:

  1. Integrates the functions https://github.com/datafusion-contrib/datafusion-functions-json
  2. Adds some basic testing for these extensions (I need to figure out how to do something similar for s3 and delta)

Ok(builder)
}

fn register_on_ctx(&self, _config: &ExecutionConfig, ctx: &mut SessionContext) -> Result<()> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main API listed on https://crates.io/crates/datafusion-functions-json/0.41.0 is in terms of SessionContext so I needed to add this

I think this experience trying to configure DataFusion externally has been a good one for me, as it is clearer how awkward it can be (not impossible, but a bit annoying). I am looking forward to bringing some of this learning back to the core with nicer APIs

//! Tests for datafusion-function-json integration

#[test]
fn test_basic() {}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to write some basic tests here

@@ -43,13 +44,15 @@ url = { version = "2.5.2", optional = true }

[dev-dependencies]
assert_cmd = "2.0.16"
insta = { version = "1.40.0", features = ["yaml"] }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to propose using https://insta.rs/ for testing -- I have found it to be quite useful (it basically automates the updating of expected output by running cargo insta review)

@@ -46,7 +45,6 @@ impl Default for Tabs {
pub struct AppState<'app> {
pub config: AppConfig,
pub should_quit: bool,
pub data_dir: PathBuf,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field is not used, and I was trying to reduce the overhead of running tests (e.g. not open files if it isn't necessary)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just double checked and that field was from the Polygon TUI I was working on and specific to that app - so good to remove

let flightsql_tab_state = FlightSQLTabState::new();
let logs_tab_state = LogsTabState::default();
let history_tab_state = HistoryTabState::default();
impl<'app> AppState<'app> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just refactored the second half of this function into AppState::new() so I could pass in a AppConfig rather than having to create one from a directory

let mut execution = TestExecution::new().with_setup(TEST_TABLE).await;

let actual = execution
.run_and_format("SELECT id, json_contains(json_col, 'b') as json_contains FROM test_table")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am quite pleased this works so well!

@alamb alamb force-pushed the alamb/json_functions branch from 9efba90 to f047cc4 Compare September 16, 2024 11:09

mod extension_cases;

/// Encapsulates an `ExecutionContext` for running queries in tests
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am hoping to use this same structure to run tests for most of the other integration setups.

@alamb
Copy link
Contributor Author

alamb commented Sep 16, 2024

I also filed a PR to improve the json docs: datafusion-contrib/datafusion-functions-json#45

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this test organization

Comment on lines +43 to +54
insta::assert_yaml_snapshot!(actual, @r###"
- +----+---------------+
- "| id | json_contains |"
- +----+---------------+
- "| 1 | false |"
- "| 2 | false |"
- "| 3 | false |"
- "| 4 | true |"
- "| 5 | true |"
- +----+---------------+
"###);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not used insta before so i find this a little confusing (in particular its not clear to me what the "yaml" is doing). But I have heard good things about insta so will start looking into it.

Copy link
Contributor Author

@alamb alamb Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My TLDR is:

  1. Just follow the model (I don't really know what assert_yaml_snapshot does internally)
  2. When the test fails you run cargo insta which will a) show you a diff and b) offer to update the expected text for you.

b is the game changer -- you can update a bunch of tests simply by running cargo insta and typing a after looking at the diffs

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds pretty cool!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is like sqllogictest --complete mode but for rust unit tests 🚀

@matthewmturner matthewmturner merged commit 64f583a into datafusion-contrib:main Sep 16, 2024
3 checks passed
@alamb alamb deleted the alamb/json_functions branch September 16, 2024 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add datafusion-functions-json
2 participants