Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deriving Mapping Mutators #2585

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

riesentoaster
Copy link
Contributor

I'm attempting to derive everything necessary for havoc-style mutators on custom inputs using mapping mutators introduced in #2422.

Still to do:

  • Fix current bug
  • Introduce more struct types other than Vec<u8>
  • Documentation

@riesentoaster
Copy link
Contributor Author

The current problem I'm running into is that the types of the combined mutators have to be specified in the macro — generics don't seem to work or at least I couldn't figure out how to do it.

Current Error Logs

error[E0308]: mismatched types
   --> src/main.rs:3:10
    |
3   | #[derive(HasHavocMutators)]
    |          ^^^^^^^^^^^^^^^^
    |          |
    |          expected type parameter `MT`, found `(MappedInputFunctionMappingMutator<BitFlipMutator, ..., ...>, ...)`
    |          arguments to this method are incorrect
    |
    = note: expected type parameter `MT`
                        found tuple `(MappedInputFunctionMappingMutator<BitFlipMutator, fn(&mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteFlipMutator, fn(&mut CustomInput) -> ... {...::vec_mut}, ...>, ...))`
    = note: the full type name has been written to '/Users/valentinhuber/GitHub/testt/target/debug/deps/testt-c72bf13c033b3f72.long-type-18083438836358437311.txt'
help: the return type of this call is `(MappedInputFunctionMappingMutator<BitFlipMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteFlipMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteIncMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteDecMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteNegMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteRandMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteAddMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<WordAddMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<DwordAddMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<QwordAddMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<ByteInterestingMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<WordInterestingMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<DwordInterestingMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesDeleteMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesDeleteMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesDeleteMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesDeleteMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesExpandMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesInsertMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesRandInsertMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesSetMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesRandSetMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesCopyMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesInsertCopyMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<BytesSwapMutator, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<MappedCrossoverInsertMutator<for<'a> fn(&'a CustomInput) -> &'a [u8] {CustomInput::vec}, &[u8]>, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, (MappedInputFunctionMappingMutator<MappedCrossoverReplaceMutator<for<'a> fn(&'a CustomInput) -> &'a [u8] {CustomInput::vec}, &[u8]>, for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a> {CustomInput::vec_mut}, MutVecInput<'_>>, ())))))))))))))))))))))))))))` due to the type of the argument passed
   --> src/main.rs:3:10
    |
3   | #[derive(HasHavocMutators)]
    |          ^^^^^^^^^^^^^^^^ this argument influences the return type of `merge`
note: method defined here
   --> /Users/valentinhuber/GitHub/LibAFL/libafl_bolts/src/tuples.rs:709:8
    |
709 |     fn merge(self, value: T) -> Self::MergeResult;
    |        ^^^^^
    = note: this error originates in the derive macro `HasHavocMutators` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0308`.
error: could not compile `testt` (bin "testt") due to 1 previous error

And this is just with a custom input containing a single Vec. It would be somewhat more doable if there was .merge()ing on tuple_list_type, otherwise I don't see a way around a good bit of code duplication or very ugly additional macro fun.

If anyone wants to suggest ideas or take a look at the current situation, please do!

@domenukk
Copy link
Member

domenukk commented Oct 7, 2024

This is cool stuff!

It might make sense to take a good look at how Arbitrary works for different types.
https://github.com/rust-fuzz/arbitrary/tree/main/src/foreign/core

Also, if we want to do this perfectly, we probably want different mutators for each type at some point.
For example for numbers using the full might of havoc seems useless, for &[u8] we probably need a subset of mutators that always keep the testcase size constant...

@riesentoaster
Copy link
Contributor Author

It might make sense to take a good look at how Arbitrary works for different types. https://github.com/rust-fuzz/arbitrary/tree/main/src/foreign/core

What are you suggesting with this? Instead of using mapping mutators just using a simple BytesInput and then parsing that back into a complex type using Arbitrary? Not sure I understand the connection between it and this PR.

Also, if we want to do this perfectly, we probably want different mutators for each type at some point. For example for numbers using the full might of havoc seems useless, for &[u8] we probably need a subset of mutators that always keep the testcase size constant...

For sure, that's the goal. I'm probably going to implement mutators for struct parts of types Vec<u8>, Option<Vec<u8>> and bool for now (because I need them), but in principle, this should be fairly easily extensible for any types that have defined mutators, so even Input structs that combine byte array and AST types would be possible.

I hope to basically build a foundation with this, that should be fairly flexible and extensible for whatever else, basically defining default mutators for each type (or type combination).

@domenukk
Copy link
Member

domenukk commented Oct 7, 2024

My point about Arbitrary is that they have derives for all possible types.
Instead of coming up with our own derives it makes sense to look at how they do it and maybe copy&paste that (?)
And then of course change it to auto-generate mutators / understand the structure instead of just using Arbitrary

@riesentoaster
Copy link
Contributor Author

Ah, that makes sense. Thank you for the suggestion! I'll see what I can do.

use super::HasHavocMutators;
use crate::mutators::{StdScheduledMutator, Vec};

#[derive(HasHavocMutators)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just say #[derive(Mutators)]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about DefaultMutators? Depending on what you're trying to accomplish, you may want different implementations.

@riesentoaster riesentoaster force-pushed the derive-mapping-mutators branch from 1fb2c28 to 3f86aaf Compare October 8, 2024 09:23
@riesentoaster
Copy link
Contributor Author

So, merging tuple_list_types was actually fairly straightforward.

Now I'm back to struggling with timelines:

pub trait DefaultMutators {
    type Type;
    fn default_mutators() -> Self::Type;
}

#[derive(Debug, Deserialize, Serialize, SerdeAny, Clone)]
struct CustomInput {
    vec: Vec<u8>,
}

impl CustomInput {
    fn vec_mut(&mut self) -> MutVecInput<'_> {
        (&mut self.vec).into()
    }

    fn vec(&self) -> &[u8] {
        &self.vec
    }
}

impl DefaultMutators for CustomInput {
    type Type = MappedHavocMutationsType<
        for<'a> fn(&'a mut CustomInput) -> MutVecInput<'a>,
        fn(&CustomInput) -> &[u8],
        MutVecInput<'a>,
        &'a [u8],
    >;

    fn default_mutators() -> Self::Type {
        mapped_havoc_mutations(CustomInput::vec_mut, CustomInput::vec)
    }
}

The issue here are the lifetimes for MutVecInput<'a> and &'a [u8]. I need to specify them based on how mapping mutators are setup for now, specifically, this is fundamentally again because we have to ensure that the lifetimes of input and output of the mapping functions match, even if we return an owned wrapper around a reference (such as MutVecInput).

To ensure this, we introduced these weird traits that are nothing but a type with a lifetime (see MappedInput and IntoOptionBytes). If we could just always pass references instead of owned types, this would not be necessary, and I could thus write MappedHavocMutationsType in a way that only needs the first two generic types. However, this would mean that we could no longer allow any generic owned wrapper type (such as MutVecInput). This would lead to custom implementations of mapping mutators for each concrete wrapper type, thus we're back to code duplication. On the other hand, it would remove the necessity for these weird hacks with MappedInput and IntoOptionBytes.

If nobody (@domenukk?) has an idea of how to deal with this, I think this is the only way forward.

@riesentoaster
Copy link
Contributor Author

I may have another idea, hold on.

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Oct 8, 2024

So I may have found a solution, but I needed to tear the entire way generics are handled in mapping mutators apart.

I've also now introduced the following trait instead of HasHavocMutators:

/// Extensions of [`crate::inputs::Input`]s that have default mutators
pub trait DefaultMutators<S, MT>: Sized {
    /// Get the default mutators for this type
    fn default_mutators() -> MT
    where
        MT: MutatorsTuple<Self, S>,
        S: HasCorpus,
        <S as HasCorpus>::Corpus: Corpus<Input = Self>;
}

There is also a proof of concept implementation for a custom input in test case in havoc_mutations.rs. Do you think this interface makes sense? Other suggestions?

We're currently back at having to specify types for the outer mapping mutator interface, but I hope to fix this at some point.

(We may get away with removing some of the generics, I'll have to check the compiler's opinion.)

@riesentoaster
Copy link
Contributor Author

So I may have found a solution, but I needed to tear the entire way generics are handled in mapping mutators apart.

I've also now introduced the following trait instead of HasHavocMutators:

/// Extensions of [`crate::inputs::Input`]s that have default mutators
pub trait DefaultMutators<S, MT>: Sized {
    /// Get the default mutators for this type
    fn default_mutators() -> MT
    where
        MT: MutatorsTuple<Self, S>,
        S: HasCorpus,
        <S as HasCorpus>::Corpus: Corpus<Input = Self>;
}

There is also a proof of concept implementation for a custom input in havoc_mutations.rs. Do you think this interface makes sense? Other suggestions?

We're currently back at having to specify types for the outer mapping mutator interface, but I hope to fix this at some point.

@domenukk
Copy link
Member

domenukk commented Oct 8, 2024

  • We usually use Std instead of Default everywhere to have a distinction to the rust Default trait.
  • I guess binding the type to some mutators makes sense, yeah. It could be cool if the user can overwrite them somehow(?)

@@ -42,7 +43,7 @@ use {
/// Coverage map with explicit assignments due to the lack of instrumentation
const SIGNALS_LEN: usize = 16;
static mut SIGNALS: [u8; SIGNALS_LEN] = [0; 16];
static mut SIGNALS_PTR: *mut u8 = addr_of_mut!(SIGNALS) as _;
static mut SIGNALS_PTR: *mut u8 = unsafe { addr_of_mut!(SIGNALS) as _ };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance you have an old compiler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think so. The unsafe is not necessary when built with +nightly, but it breaks without.

$ rustup --version
rustup 1.27.1 (54dd3d00f 2024-04-24)
info: This is the version for the rustup toolchain manager, not the rustc compiler.
info: The currently active `rustc` version is `rustc 1.81.0 (eeb90cda1 2024-09-04)`
$ cargo --version
cargo 1.81.0 (2dbb1af80 2024-08-20)
error[E0133]: use of mutable static is unsafe and requires unsafe function or block
  --> src/main.rs:46:48
   |
46 | static mut SIGNALS_PTR: *mut u8 = addr_of_mut!(SIGNALS) as _;
   |                                                ^^^^^^^ use of mutable static
   |
   = note: mutable statics can be mutated by multiple threads: aliasing violations or data races will cause undefined behavior

For more information about this error, try `rustc --explain E0133`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd, it doesn't do that for me

@riesentoaster
Copy link
Contributor Author

  • We usually use Std instead of Default everywhere to have a distinction to the rust Default trait.

I get that for the standard implementations overall, independent of other types or parts of the fuzzer. In this case, I personally think it's closer to Default, since the mutators are fairly closely tied to the input type. If there were mutators that worked with any type, then I'd call them std_mutators(). This is only a very slight preference though, I'll gladly change them if you want to.

  • I guess binding the type to some mutators makes sense, yeah. It could be cool if the user can overwrite them somehow(?)

I took out a lot of unnecessary constraints, including all on DefaultMutators. I also moved to an associated type from a generic, because it removes code duplication in the implementations. It now looks like this:

/// Extensions of [`crate::inputs::Input`]s that have default mutators
pub trait DefaultMutators {
    /// The resulting mutator list type
    type Type;
    /// Get the default mutators for this type
    #[must_use]
    fn default_mutators() -> Self::Type;
}

Regarding overwriting: You can always implement other mutators for your input type and/or combine them with a subset of existing ones.

Also: I can't for the life of me figure out how to fix the typing such that users don't have to specify their types — the fix from last time doesn't work anymore because of the different generic architecture. But also: this is not an issue if you just use the default_mutators, since the types are specified in there. So maybe it's ok?

@domenukk
Copy link
Member

domenukk commented Oct 9, 2024

Do you have an example where the mutators are derived?

@riesentoaster
Copy link
Contributor Author

Not yet, no — that's the end goal of this PR. But I need something to derive first, so I'm writing the interface, then a manual implementation in a test case, and then I'll continue with the macro.

@riesentoaster riesentoaster force-pushed the derive-mapping-mutators branch from 2a14db9 to 4593747 Compare December 11, 2024 11:05
@riesentoaster riesentoaster force-pushed the derive-mapping-mutators branch from 4593747 to 9eea795 Compare December 11, 2024 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants