Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overriding the default endianness #14

Closed
StephanvanSchaik opened this issue Mar 19, 2021 · 5 comments · Fixed by #22
Closed

Overriding the default endianness #14

StephanvanSchaik opened this issue Mar 19, 2021 · 5 comments · Fixed by #22
Labels
enhancement New feature or request
Milestone

Comments

@StephanvanSchaik
Copy link
Contributor

Would it be possible to override the big endian default for the entire crate or a particular file/module, rather than having to specify #[nom(LittleEndian)] for every single struct? It is currently pretty easy to miss specifying #[nom(LittleEndian)], and it is kind of difficult to detect this mistake without carefully looking at your code.

@chifflier
Copy link
Collaborator

Hi,
Thanks for the suggestion.
I have thought about it and tried to set attributes to the entire crate or modules, but at this point I have not decided which solution is best.

env variable

By setting an environment variable, it should be possible to set a default endianness.
I do not especially like this, because it will require adding a build.rs file, and splits information about the parser generation in multiple places.

inner attributes

Custom inner attributes would be a solution, unfortunately they are not supported in Rust (tracking issue).

Maybe there is a workaround: it should be possible to add proc_macro attributes that acts like a global switch, and change the default endianness (unless specified in the struct). Depending on how proc-macro works, this could have side-effects (all parsers would be changed at least in the same file, and possibly in files compiled after this one - this needs to be tested).

This would look like:

// [...]

#[default_little_endian] // switch default to little-endian (starting from here)
#[derive(Nom)]
pub struct AA { // will be derived as LE
}

#[derive(Nom)]
pub struct BB { // will be derived as LE
}

#[derive(Nom)]
#[nom(BigEndian)] // override endianness for this struct
pub struct CC { // will be derived as BE
}

#[default_big_endian] // switch back to big-endian by default
#[derive(Nom)]
pub struct DD { // will be derived as BE
}

Any thoughts?

@StephanvanSchaik
Copy link
Contributor Author

My understanding of how macros/custom attributes work exactly in Rust is still a bit limited, but the ideal solution would probably be something like (I am assuming that this is what you can possibly achieve with custom inner attributes):

#![default_little_endian] // where this applies to everything inside mod.
mod {
    pub struct AA {
    }
    
    pub struct BB {
    }
}

// and everything outside is still big endian.

The global switch solution does look good, given that you can do it on a file-per-file basis without the state leaking to other files. As most people would just use #[default_little_endian] once for the entire file, and probably not switch back to big endian even.

@chifflier
Copy link
Collaborator

Setting an attribute to an inline mod would be nice, but currently there are some difficulties/limitations (due to the way proc-macro/syn) works:

  • this would work for an inline mod, but not for a module defined as a file:
#[default_little_endian] // attribute accepted
pub mod parser {
    pub struct S {
    // ...
}

#[default_little_endian] // this will NOT work: non-inline modules in proc macro input are unstable
// see https://github.com/rust-lang/rust/issues/54727 for tracking issue
mod parser;

I'm afraid this kind of subtle differences would be hard to explain and could cause many bugs

  • more importantly, proc-macro and syn provide no way to get information from the module containing a struct (this is an information the compiler knows after analysis, while the crates provide helpers for syntax and parsing only). In other words, there is no easy way to tell that struct S belongs to mod parser

This is why I think the global switch is currently the least bad option. However, I need to determine what are the side effects, for example:

  • what happens if you switch the default endianness, then declare a non-inline mod?
  • More generally, do the default switch affect only the current file, or others?

If the answer is "only the current file", then I think it's a working solution, and a good compromise.

@chifflier
Copy link
Collaborator

Update on this topic:

  • it seems there is no proper solution to keep a state in a proc-macro (tacking issue: Crate local state for procedural macros? rust-lang/rust#44034). We could store values when a macro is called, but even if that seems to work, behavior may change with compiler. Most problematic points are that proc macros may not always be called (for ex in case of incremental compilation), and ordering of macros if not guaranteed.
  • as explained above, applying an attribute to a module is not reliable

Currently, I'm tempted to add new custom derive attributes (NomLE and NomBE, globally equivalent to Nom). Using namespaces, this would help achieve the same as changing globally:

use nom_derive::NomLE as Nom;

// all structs using #[derive(Nom)] will use little-endian

As this would probably mean important (and API) changes in the existing code, I'd like to think a bit before validating this.
So, this won't reach upcoming release 0.7.2, but can be planned for 0.8.0

@chifflier chifflier added the enhancement New feature or request label Mar 31, 2021
@chifflier chifflier added this to the 0.8.0 milestone Mar 31, 2021
@StephanvanSchaik
Copy link
Contributor Author

Just chiming in to say that providing NomBE and NomLE does sound like a very nice approach to this from a user's perspective, as it is very easy to drop in existing code bases. This would also work nicely in combination with conditional compilation to choose NomBE/NomLE based on feature flags or the architecture, if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants