-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor: Possibility to strip datafusion error name #10186
Conversation
The implementation might not be super elegant though, please share your thoughts and opininions |
@andygrove cc |
datafusion/common/src/error.rs
Outdated
assert_eq!(res.strip_error_name(), "Err"); | ||
|
||
// Test only top level stripped | ||
let res: Result<(), DataFusionError> = plan_err!("Err"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like these are two scenarios that you are testing, should they go in two different test cases ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @edmondop I'd say the second scenario is variation of first one. Not sure if every variation requires its own test tbh
datafusion/common/src/error.rs
Outdated
/// Error has a format `{error_name}{`error`} | ||
/// The method strips the `error_name` from level and outputs 'error`` | ||
pub fn strip_error_name(self) -> String { | ||
self.to_string().replace(self.get_error_name(), "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better if we could avoid including the error message in the first place rather than removing it later. There is the risk of accidentally removing text that was not added by impl Display for DataFusionError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about this in the first place.
-
We can fix the formatter so it will skip the error name for some format options, that doesn't work with
to_string()
which has default formatter. -
we can tweak
impl Display
with cargo feature and return name as blank if the feature enabled and that looks little bit overcompicated. -
we can refer to env variable and depending on the value define the error name but this might be expensive.
-
we can probably introduce a DF config param?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure exactly what the usecase is, but if seems like if the only need is:
Sometimes it is required to fetch the original message without Datafusion error name.
Rather than trying to fix the message after it has been created, how about we add a function that just provides the underlying message? Something like
impl DataFusionError {
/// retrieve the underlying error message for this error, without the
/// type prefix.
///
/// For example, if err.to_string() returns `Error during planning: My Query Error`
/// this function would return only `My Query Error`
fn message(&self) -> &str { .. }
}
And then the impl could be
impl Display for DataFusionError {
fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {
match *self {
DataFusionError::ArrowError(ref desc, ref backtrace) => {
let backtrace = backtrace.clone().unwrap_or("".to_owned());
write!(f, "Arrow error: {}{backtrace}", self.message())
}
#[cfg(feature = "parquet")]
DataFusionError::ParquetError(ref desc) => {
write!(f, "Parquet error: {}", message)
}
...
}
Maybe fn message(&self)
would have to return Cow<str>
or something to avoid copying in the external cases....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like the simplest solution.
Our use case (Comet) is that we are using thiserror and we are delegating to DataFusionError for the error string and it currently uses the Display
trait.
@comphead I don't know enough about thiserror yet to know if/how we could call the new method proposed here. Do you know?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like it works even without thiserror magic, we can call .message()
method in From<>
error conversion
Co-authored-by: Andy Grove <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but agree that it would be better to rename error_name
to error_prefix
Which issue does this PR close?
Closes #.
Rationale for this change
Currently Datafusion wraps the original error message and includes Datafusion name on top of it.
If the original message is
My Query Error
the user after unwrapping theDataFusionError
will receive wrapped messageExecution error: My Query Error
orError during planning: My Query Error
, etc. Sometimes it is required to fetch the original message without Datafusion error name.What changes are included in this PR?
DataFusionError
impl expanded and includes the public methodstrip_error_name
which unwraps error into original messageAre these changes tested?
Are there any user-facing changes?