-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enum refactoring #6158
enum refactoring #6158
Conversation
Currently And generated code for the checks could have some problems. For example, big matches are awful for debug builds, see this issue. This problem could be fixed by range matching, i.e. match x {
x if -1 <= x && x <= 2 => (),
_ => return Err(()),
}
Ok(unsafe { transmute(x) }) |
I think you need a way to write unrecognized enums into a flatbuffer. An example use case is a proxy service that talks to embedded devices that report an integer that's really an enum. Maybe after a firmware update a new int is reported and a monitoring service behind the proxy recognizes it but the proxy hasn't been updated yet. In this scenario, being unable to write an invalid enum value will lead to either an outage or a coupling of releases between the proxy service and the embedded devices's firmware. Imo Also, I think instead of transmuting in unsafe, you should safely return the base type. Maybe |
Rather than returning an int as an error, it's better to put it in an error. Will wait for more comments and make changes tomorrow. |
I will restate that I think it is a really bad idea for an enum to have an accessor to return FlatBuffer enum values are a set of integers of which you may not know all the possible values. This concept apparently does not match cleanly onto Rust enums, so I'd suggest using an implementation technique that represents them better, such as #6098 FlatBuffers has its own semantics, and it is important that all participating languages are able to access that semantics as faithfully as possible, by whatever means they have available. For example, in Java, we don't use the built-in enum feature either, because it is a heap allocated object :) Heck, Java (and most languages) can't even represent FlatBuffer strings directly, and its rather painful. |
@aardappel this PR contains direct int to enum casts exactly for the case you describe - when someone does not want to deal with a Result and trusts a buffer. It's exactly what the c++ version does. For people wanting to be extra safe, the Result version is also provided. This way we're satisfying both use cases while keeping the code idiomatic. |
As for compatibility - this change also solves this problem with both apis present. There is no way to distinguish programmatically if an incoming invalid value comes from an updated schema, or from an invalid buffer, so using the Result for people wanting to protect against such situations, gives them the possibility to do so in a Rust way without sacrificing strong enum type. |
@krojew this seems to leak Multiple accessors is not great. At the very least I'd switch the naming, such the ergonomic method has the field name, and the other one appends |
In the end, it's for you rustaceans to decide. I am just giving the FlatBuffers perspective :) |
We can tweak the naming, although in Rust, the conversion is to use _unchecked where checks are omitted, not the other way around. Given people use IDEs with code completion, I suggest going the "traditional" route, although I'm fine either way. |
If you're referring to the unsafe transmute thing, that is not forwards compatible. Its UB. Its actually Rust-only UB too since C++ is not that opinionated about enums.
I agree, the main distinction between this and #6098 is confining the handling of unrecognized variants to the I still think going the zero-copy + language neutral route in #6098 is the right way to go, but my (perhaps terrible) compromise is that we can offer the result-y thing behind the object API which is meant to be more native. @rw pls decide for us |
@CasperN for forwards compatibility the Result version is provided. Or rather will be because I forgot to put the actual invalid value in the error, which I'll fix tomorrow. The transmutation is for people trusting the buffer is valid at this point in time. The PR, in fact, combines the Rust and c++ approach into one. The user can decide which is appropriate for a given situation. |
Thanks @krojew for making this PR! Here is my perspective:
Idea: Allow users to specify a feature flag in the crate that prevents the use of unsafe code. |
Let's consider that out of scope for now. I filed #6161 to track that effort. |
@rw I absolutely agree buffer validation is the answer, but I think we should have something in place before we get one. Having c++-like casts solves the enum unsoundness issue we have for now. Also - why do you think transmute won't work for wider types? Their bit representation is the same, after all. |
I brought back thiserror dependency since it makes life easier and we're not no_std anyway. We'll cross that bridge when we get there. |
@krojew transmute isn't endian-safe and we support big-endian hosts. |
get() already takes care of endianess, so we're transmuting proper scalars. |
I'm on the fence about this. On one hand it's a legitimate use case, but on the other, it's putting invalid (at given point in time) values in the buffer in hope the receiving end will know how to interpret them. I personally see this as abusing the schema. Such fields should rather be integers than enums. @aardappel what do you think? |
rust/flatbuffers/src/error.rs
Outdated
#[derive(Error, Copy, Clone, Ord, PartialOrd, Eq, PartialEq, Hash, Debug)] | ||
pub enum ConvertError<T: Debug + Display> { | ||
#[error("invalid value in buffer: {0}")] | ||
InvalidValue(T), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be called UnknownVariant
or UnrecognizedVariant
and that it is incorrect to say a variant is "invalid" since it could be created by a future version of the schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point.
} | ||
} | ||
} | ||
|
||
#[allow(non_camel_case_types)] | ||
pub const ENUM_VALUES_RACE: [Race; 4] = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please deprecate ENUM_VALUES_RACE
, ENUM_NAMES_RACE
, and enum_name_race
, and make them associated constants of Race
; or leave a TODO to do so in a future pull
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These might be useful. Are there reasons to deprecate them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its more idiomatic for these to be associated constants with namespaced names rather than old-c-style prefixy names.
e.g. Race::VALUES
instead of ENUM_VALUES_RACE
.
While we're renaming things we should give people some opportunity to refactor their code, i.e. deprecate and give a deletion timeline.
self._tab.get::<i8>(Monster::VT_SIGNED_ENUM, Some(-1)).map(|value| value.try_into()).unwrap() | ||
} | ||
#[inline] | ||
pub unsafe fn signed_enum_unchecked(&self) -> Race { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be an method for getting the raw i8
value from signed_enum
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add reverse TryFrom.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather From, since it's infallible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather From, since it's infallible.
No, not a Enum as i8
, but instead a safe infallible getter that gets i8
from the struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
FlatBuffers semantics say that such unknown values are possible, and implementations should just ignore them if they don't know what they are. If you must assume that an unknown integer is a valid enum value "from the future", then being able to copy these values is legit as well. Or put differently, given any particular schema, there are known integer values, and every other integer value is fair game for a future enum value. Thus, no integer value (that fits the underlying type) is ever illegal. |
From my understanding, we are not interpreting, nor copying an unknown value, but producing one. |
while that is certainly more odd, it is not something that would count as illegal from the FlatBuffers side. |
This PR is an alternative solution to unsoundess in #5581. The benefits over #6098 are:
unsafe
accessor which doesn't incur any performance penalty due to transmutation. The user is responsible for ensuring the buffer is trusted.This solution is also compatible with potential future verifier returning trusted buffer types.
@rw @aardappel r?