Fix a bug in gx integration #1675

Merged
merged 6 commits into from
Jun 28, 2023

Conversation

XinEDprob
Contributor

@XinEDprob XinEDprob commented Jun 5, 2023

TL;DR

Fixed a bug in the integration of Flyte and Great Expectations. Without this fix, a PySpark DataFrame is always converted to a pandas DataFrame before it reaches Great Expectations. As a result, validation fails with an error when users configure SparkDFExecutionEngine.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • [] Any pending items have an associated Issue

Complete description

I use selected_datasource[0]["execution_engine"]["class_name"] to identify the execution engine the user declared for Great Expectations. If it is SparkDFExecutionEngine, the data, which arrives as a FlyteSchema, is converted to pyspark.sql.dataframe.DataFrame; otherwise it is converted to the default pandas.DataFrame.
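
The check described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the helper names `uses_spark_engine` and `target_dataframe_type` and the sample datasource dictionaries are hypothetical, and the sketch only returns the name of the target type instead of performing the real FlyteSchema conversion.

```python
from typing import Any, Dict, List

# Class name that Great Expectations uses for its Spark execution engine.
SPARK_ENGINE = "SparkDFExecutionEngine"


def uses_spark_engine(selected_datasource: List[Dict[str, Any]]) -> bool:
    """Return True if the first selected datasource declares the Spark engine.

    Hypothetical helper: `selected_datasource` is assumed to be a list of
    datasource config dicts, as implied by the indexing in the description.
    """
    if not selected_datasource:
        return False
    engine = selected_datasource[0].get("execution_engine", {})
    return engine.get("class_name") == SPARK_ENGINE


def target_dataframe_type(selected_datasource: List[Dict[str, Any]]) -> str:
    """Name the dataframe type the FlyteSchema data should be opened as."""
    if uses_spark_engine(selected_datasource):
        return "pyspark.sql.dataframe.DataFrame"
    # Every other engine falls back to the default pandas conversion.
    return "pandas.DataFrame"


# Illustrative configs (assumed shape, not taken from the PR).
spark_cfg = [{"name": "spark_source",
              "execution_engine": {"class_name": "SparkDFExecutionEngine"}}]
pandas_cfg = [{"name": "pandas_source",
               "execution_engine": {"class_name": "PandasExecutionEngine"}}]

print(target_dataframe_type(spark_cfg))
print(target_dataframe_type(pandas_cfg))
```

Using `.get` with defaults means a datasource that omits `execution_engine` falls through to the pandas branch rather than raising `KeyError`; whether the plugin should behave that way is a design choice this sketch assumes.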

Tracking Issue

https://github.com/flyteorg/flyte/issues/

Follow-up issue

NA

@welcome

welcome bot commented Jun 5, 2023

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@codecov

codecov bot commented Jun 6, 2023

Codecov Report

Merging #1675 (6297242) into master (c8433ea) will increase coverage by 0.02%.
The diff coverage is n/a.

❗ Current head 6297242 differs from pull request most recent head 016d20b. Consider uploading reports for the commit 016d20b to get more accurate results

@@            Coverage Diff             @@
##           master    #1675      +/-   ##
==========================================
+ Coverage   71.00%   71.03%   +0.02%     
==========================================
  Files         336      336              
  Lines       30781    30798      +17     
  Branches     5576     5589      +13     
==========================================
+ Hits        21855    21876      +21     
+ Misses       8379     8375       -4     
  Partials      547      547              

see 15 files with indirect coverage changes

Member

@pingsutw pingsutw left a comment

@XinEDprob a test is failing, mind taking a look?

@XinEDprob
Contributor Author

@pingsutw thanks for letting me know. I will take a look at the test and finish the corresponding change this week.

@XinEDprob XinEDprob requested a review from pingsutw June 14, 2023 19:26
pingsutw previously approved these changes Jun 15, 2023
Signed-off-by: Xin Shi <[email protected]>
pingsutw previously approved these changes Jun 23, 2023
Member

@pingsutw pingsutw left a comment

There is a lint error. Could you run make lint and push again?

@XinEDprob
Contributor Author

@pingsutw do you have any insight into why the test is failing? I do not think my changes affect that part.

@pingsutw
Member

@XinEDprob nvm, that's a flaky test. merged it

@pingsutw pingsutw merged commit f173213 into flyteorg:master Jun 28, 2023
@welcome

welcome bot commented Jun 28, 2023

Congrats on merging your first pull request! 🎉

2 participants