Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce rust native sqllogictest framework #9150

Merged
merged 37 commits into from
Dec 17, 2022

Conversation

xudong963
Copy link
Member

@xudong963 xudong963 commented Dec 8, 2022

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

The ticket aims to implement rust native sqllogictest framework and replace current python version.

Reasons:

  1. Unify our codebase, all in rust!
  2. The maintainer of the python version has left databend, the python version is a little difficult to maintain.
  3. sqllogictest-rs becomes more and more stable, the guys are talented, and we can work with them to make sqllogictest-rs better

Todos:

  • refactor code: walking dir to read all test files, remove hard code and etc.
  • change current logictest file's syntax, make it standard: such as statement query II -> query II.
  • add args to support run specific test dir or file.
  • make all tests can be run under new sqllogictest framework.
  • add it to CI

After finishing the above todos, I'll make it ready for review (maybe the weekend)

Future todos:

See #9174

Closes #issue

@vercel
Copy link

vercel bot commented Dec 8, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Updated
databend ⬜️ Ignored (Inspect) Dec 17, 2022 at 7:20AM (UTC)

@xudong963 xudong963 marked this pull request as draft December 8, 2022 10:44
@xudong963 xudong963 removed request for Xuanwo and PsiACE December 8, 2022 10:44
@mergify mergify bot added the pr-feature this PR introduces a new feature to the codebase label Dec 8, 2022
@Xuanwo
Copy link
Member

Xuanwo commented Dec 8, 2022

I like this!

@flaneur2020
Copy link
Member

+10086!

tests/sqllogictests/Cargo.toml Outdated Show resolved Hide resolved
tests/sqllogictests/suits/test.slt Outdated Show resolved Hide resolved
@Xuanwo

This comment was marked as resolved.

@flaneur2020

This comment was marked as resolved.

@xudong963
Copy link
Member Author

0b7305a The commit fixed test files format and is large. So you can ignore it when reviewing.

@xudong963
Copy link
Member Author

Support to run specific dir or test files:

➜  databend git:(new_sqllogictest) ✗ cargo run -p sqllogictests -- --help
    Finished dev [unoptimized + debuginfo] target(s) in 0.27s
     Running `target/debug/sqllogictests --help`
Mysql client starts to run...
Usage: sqllogictests [OPTIONS]

Options:
  -d, --dir <DIR>    Run sqllogictests in specific directory, the arg is optional
  -f, --file <FILE>  Run sqllogictests in specific test file, the arg is optional
  -h, --help         Print help information

@Xuanwo
Copy link
Member

Xuanwo commented Dec 9, 2022

Support to run specific dir or test files:

sqllogictest has a binary which supports pg, can we share our work with them? I think we only need to maintain the HTTP API part.

@xudong963
Copy link
Member Author

Support to run specific dir or test files:

sqllogictest has a binary which supports pg, can we share our work with them? I think we only need to maintain the HTTP API part.

Do you mean let sqllogictest-rs binary support mysql?

@Xuanwo
Copy link
Member

Xuanwo commented Dec 9, 2022

Do you mean let sqllogictest-rs binary support mysql?

Yes, we can run the binary directly instead of cargo run which need to build the runner everytime.

@xudong963
Copy link
Member Author

Do you mean let sqllogictest-rs binary support mysql?

Yes, we can run the binary directly instead of cargo run which need to build the runner everytime.

This can be advanced afterwards, perhaps until their next release, my efforts now are mainly focused on getting the new framework running inside the CI.

I opened an issue last night: risinglightdb/sqllogictest-rs#120. Should be part of your proposal

@Xuanwo
Copy link
Member

Xuanwo commented Dec 13, 2022

image

Respect! 🙇

@Xuanwo
Copy link
Member

Xuanwo commented Dec 13, 2022

I think add ability to process Regex can be moved to next PR.

@xudong963
Copy link
Member Author

I think add ability to process Regex can be moved to next PR.

Yes, it's very tricky, I think we should find a more suitable test framework to support it.

@xudong963
Copy link
Member Author

image

Respect! 🙇

Still fighting! HTTP and ck HTTP client also need to check, I think there are still many blocks.

@xudong963
Copy link
Member Author

image

It seems like a flaky test, but it can't reproduce in my local... (Btw, it's normal in old logictest)

@BohuTANG
Copy link
Member

@xudong963

If this one test is blocked for a while, we can file a tracking issue and skip it in this PR.

Copy link
Member Author

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the ticket is ready for review! Its changes contain three parts: CI, logictest files and new sqllogictest framework. The following is a summary of the ticket(hope it will help reviewers):

  1. For CI, the main changes are removing some files used by old framework and changing the commands that run new logictests. cc @Xuanwo @everpcpc @PsiACE
  2. For test files, the changes are as follows
    • Remove the semicolon after the query.
    • Remove the blank line between the query and ----.
    • Replace the comment symbol from -- to ##.
    • For those results where the output is empty, such as select '', 1; the first cell, will be displayed instead of (empty), which can reduce confusion.
    • Temporarily commenting out queries that contain dynamic results in their output, such as db1 t1 FUSE $ANYTHING $DATE NULL 0 0 0 0 0, we need a better way to handle them.
  3. For new sqllogictest framework, it's under sqllogictests dir.
    • It's based on sqllogictest-rs crate.
    • It supports three clients. ping @youngsofun for review
    • Walk dir part needs to review carefully

FYI, the future todos: #9174

@xudong963 xudong963 marked this pull request as ready for review December 17, 2022 08:06
@BohuTANG
Copy link
Member

Let's merge first, since it with a big diff.

@BohuTANG BohuTANG merged commit 18ef7e3 into databendlabs:main Dec 17, 2022
long = "run_dir",
help = "Run sqllogictests in specific directory, the arg is optional"
)]
pub dir: Option<String>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can merge dir and file into a glob path? So that we will have

  • path: Vec<String> => the glob paths for tests
  • skip_path: Vec<String> => the gloab paths that need to ignore
./test -p "ydb/select**" -s "ydn/select1-1.test"

impl ClickhouseHttpClient {
pub fn create() -> Result<ClickhouseHttpClient> {
let client = Client::new();
let url = "http://127.0.0.1:8124".to_string();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an existing issue with this: #8689

It's better to support accept a Vec<String> as request URLs and we can call different node in the cluster for better performance and find more potential bug.


pub async fn query(&mut self, sql: &str) -> Result<DBOutput> {
// Client will save the following info: use database, settings (session level info)
// Then send them to server, so even though the session changes, database and settings context is correct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior seems weird. After a session has been restored, all session related changes should be reset. Is this the same behavior of a real clickhouse service?

.basic_auth("root", Some(""))
.send()
.await?;
// `res` is tsv format
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use json as a response?

pub fn create() -> Result<HttpClient> {
let mut header = HeaderMap::new();
header.insert(
"Content-Type",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to use http::header::CONTENT_TYPE.

let mut response = self.response(sql, &url, true).await?;
// Set session from response to client
// Then client will same session for different queries.
self.session = serde_json::from_value(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about read the whole content via response.bytes() and deserialize as json instead of calling response.as_object() multiple times?

Or maybe just response.json()?

SelfError(String),
}

impl From<TestError> for DSqlLogicTestError {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use Disconnect(#[from] io::Error) to avoid those code.

return http_client.query(sql).await;
}
println!("Running sql with clickhouse client: [{}]", sql);
self.ck_client.as_mut().unwrap().query(sql).await
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't need to check this ck_client?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we must chose on from those three engines, I prefer to use an enum here for better reading.

for handler in handlers.iter() {
match handler.as_str() {
"mysql" => {
println!("Mysql client starts to run...");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use log instead of println so users have more control on this.

env_logger::init();
let args = SqlLogicTestArgs::parse();
if let Some(handlers) = &args.handlers {
for handler in handlers.iter() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's simpler to use mysql,http,clickhouse as the default value of handlers?

@Xuanwo
Copy link
Member

Xuanwo commented Dec 20, 2022

cc @xudong963, here are some review you may be interested.

@xudong963
Copy link
Member Author

cc @xudong963, here are some review you may be interested.

Oh, I forgot it, will take a look later! Thanks @Xuanwo

@xudong963 xudong963 deleted the new_sqllogictest branch December 20, 2022 09:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-feature this PR introduces a new feature to the codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants