Performance depends in surprising ways on struct definitions #1958

aaronbembenek-aws · 2022-12-04T16:51:39Z

Subtle differences in the definitions of structs---even when data of that type is never created---excessively affects Kani's performance.

I tried this code:

enum Expr {
    A,
    B(Box<Expr>),
    C(Box<Expr>, Box<Expr>),
    D(Box<Expr>, Box<Expr>, Box<Expr>),
    E(Box<Expr>, Box<Expr>, Box<Expr>, Box<Expr>),
}

enum Result<S, T> {
    Ok(S),
    Err(T),
}

enum Err<X, Y, Z> {
    A(X),
    B(Y, Z),
}

type Err1 = Err<String, String, String>;
type Err2<'a> = Err<String, &'a str, String>;
type Err3<'a> = Err<String, String, &'a str>;
type Err4<'a> = Err<String, &'a str, &'a str>;
type Err5<'a> = Err<&'a str, String, String>;
type Err6<'a> = Err<&'a str, &'a str, String>;
type Err7<'a> = Err<&'a str, String, &'a str>;
type Err8<'a> = Err<&'a str, &'a str, &'a str>;
type Err9<'a> = Err<Expr, &'a str, String>;
type Err10<'a> = Err<Box<Expr>, &'a str, String>;

// Takes >10s
#[cfg_attr(kani, kani::proof, kani::unwind(2))]
fn slow_harness1() {
    let _: Result<Expr, Err2> = Result::Ok(Expr::A);
}

// Takes >10s
#[cfg_attr(kani, kani::proof, kani::unwind(2))]
fn slow_harness2() {
    let _: Result<Expr, Err9> = Result::Ok(Expr::A);
}

// Takes ~1s
#[cfg_attr(kani, kani::proof, kani::unwind(2))]
fn fast_harness() {
    let _: Result<Expr, Err1> = Result::Ok(Expr::A);
    //let _: Result<Expr, Err2> = Result::Ok(Expr::A);
    let _: Result<Expr, Err3> = Result::Ok(Expr::A);
    let _: Result<Expr, Err4> = Result::Ok(Expr::A);
    let _: Result<Expr, Err5> = Result::Ok(Expr::A);
    let _: Result<Expr, Err6> = Result::Ok(Expr::A);
    let _: Result<Expr, Err7> = Result::Ok(Expr::A);
    let _: Result<Expr, Err8> = Result::Ok(Expr::A);
    //let _: Result<Expr, Err9> = Result::Ok(Expr::A);
    let _: Result<Expr, Err10> = Result::Ok(Expr::A);
}

using the following command line invocations:

kani file.rs --harness slow_harness1
kani file.rs --harness slow_harness2
kani file.rs --harness fast_harness

with Kani v0.16.0 (commit #55ba7d64) and CBMC v5.71.0.

I expected to see this happen: the time it takes to verify

let _: Result<Expr, T> = Result::Ok(Expr::A);

does not depend on the instantiation of the type parameter T.

Instead, this happened: the time it takes varies greatly with what T is instantiated with.

It takes >10s if T is instantiated with Err2:

struct Err2<'a> {
    A(String),
    B(&'a str, String)
}

It takes <1s if T is instantiated with Err3:

struct Err3<'a> {
    A(String),
    B(String, &'a str)
}

Furthermore, performance also depends on surprising ways on the type Expr. For example, say we add another variant:

enum Expr {
    A,
    B(Box<Expr>),
    C(Box<Expr>, Box<Expr>),
    D(Box<Expr>, Box<Expr>, Box<Expr>),
    E(Box<Expr>, Box<Expr>, Box<Expr>, Box<Expr>),
    F(Box<Expr>, Box<Expr>, Box<Expr>, Box<Expr>, Box<Expr>), // new variant
}

Now, slow_harness1 is actually faster than before - it takes less than 1s. However, if we instead tweak the new variant by dropping one of the arguments, performance is much worse (40s):

enum Expr {
    A,
    B(Box<Expr>),
    C(Box<Expr>, Box<Expr>),
    D(Box<Expr>, Box<Expr>, Box<Expr>),
    E(Box<Expr>, Box<Expr>, Box<Expr>, Box<Expr>),
    F(Box<Expr>, Box<Expr>, Box<Expr>, Box<Expr>), // new variant
}

The text was updated successfully, but these errors were encountered:

tautschnig · 2022-12-05T14:00:39Z

In fast_harness, and now just taking the case let _: Result<Expr, Err3> = Result::Ok(Expr::A); there is an assignment

var_1.case = 0;

(which eventually controls the branch taken by drop_in_place). In slow_harness1 the related assignment that will control the behaviour is

*((unsigned long int *)&var_1) = 2;

I don't know why such different code is generated, but it seems that CBMC does not currently succeed in constant-propagating the latter case. I'll see what can be done about this on the CBMC side, but you might also want to investigate why Kani produces such different code here.

This fixes a bits/bytes confusing, resulting in missed optimisations of byte_extract(byte_update(...)) nestings when the byte_extract was at an offset higher than the update. Found while debugging model-checking/kani#1958.

tautschnig · 2022-12-05T14:29:44Z

Hmm, perhaps the reason for the above code is that the element at offset 0 is padding:

Symbol......: tag-_4307619747238004211::Ok
Pretty name.: Result<Expr, Err<std::string::String, &str, std::string::String>>::Ok
Module......:
Base name...: _4307619747238004211::Ok
Mode........: C
Type........: struct _4307619747238004211::Ok { unsigned long int $pad0; struct _18071228790958698154 0; }
Value.......:
Flags.......: type auxiliary
Location....:

I have no idea why Kani inserts padding at the beginning of this struct, but this at least explains why it isn't possible to use a member access here.

Anyway, diffblue/cbmc#7411 fixes the problem of the missing simplification/constant-propagation on the CBMC side.

aaronbembenek-aws · 2022-12-05T15:18:16Z

Amazing - thanks @tautschnig for taking a look and quickly coming up with a fix! I'll check with the rest of the Kani team about the struct padding and open an issue if it's unexpected behavior.

This fixes a bits/bytes confusing, resulting in missed optimisations of byte_extract(byte_update(...)) nestings when the byte_extract was at an offset higher than the update. Found while debugging model-checking/kani#1958.

celinval · 2022-12-05T18:21:36Z

We should only close this once the issue has been fixed and a test has been added to our regression.

tedinski · 2022-12-05T18:47:35Z

It looks like we do some pretty worst-case accesses to discriminants:

kani/kani-compiler/src/codegen_cprover_gotoc/codegen/rvalue.rs

Lines 582 to 595 in 886618c

    
               /// Extract the niche value from `v`. This value should be of type `niche_ty` and located 
        
               /// at byte offset `offset` 
        
               pub fn codegen_get_niche(&self, v: Expr, offset: Size, niche_ty: Type) -> Expr { 
        
                   if offset == Size::ZERO { 
        
                       v.reinterpret_cast(niche_ty) 
        
                   } else { 
        
                       v // t: T 
        
                           .address_of() // &t: T* 
        
                           .cast_to(Type::unsigned_int(8).to_pointer()) // (u8 *)&t: u8 * 
        
                           .plus(Expr::int_constant(offset.bytes(), Type::size_t())) // ((u8 *)&t) + offset: u8 * 
        
                           .cast_to(niche_ty.to_pointer()) // (N *)(((u8 *)&t) + offset): N * 
        
                           .dereference() // *(N *)(((u8 *)&t) + offset): N 
        
                   } 
        
               }

It could be this is in the category of "things we translate poorly" for CBMC?

tautschnig · 2022-12-06T12:50:24Z

It looks like we do some pretty worst-case accesses to discriminants: [...]

I don't know how hard it would be to do better than this, but the current approach will quite heavily rely on CBMC's capability to rewrite and simplify expressions. Producing something more high-level seems worth investigating.

aaronbembenek-aws added [E] Performance Track performance improvement (Time / Memory / CPU) [C] Feature / Enhancement A new feature request or enhancement to an existing feature. labels Dec 4, 2022

tautschnig mentioned this issue Dec 5, 2022

fixup! Simplify byte_extract(byte_update(...)) without overlap diffblue/cbmc#7411

Merged

3 tasks

aaronbembenek-aws closed this as completed Dec 5, 2022

celinval reopened this Dec 5, 2022

celinval added the T-CBMC Issue related to an existing CBMC issue label Dec 5, 2022

zhassan-aws added a commit to zhassan-aws/kani that referenced this issue Dec 8, 2022

Add test for model-checking#1958

b67defb

zhassan-aws mentioned this issue Dec 8, 2022

Bump CBMC to version 5.72.0 #1941

Merged

4 tasks

zhassan-aws closed this as completed in #1941 Dec 9, 2022

aaronbembenek-aws mentioned this issue Dec 10, 2022

Different enum layouts result in 10x difference in verification time #1982

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance depends in surprising ways on struct definitions #1958

Performance depends in surprising ways on struct definitions #1958

aaronbembenek-aws commented Dec 4, 2022

tautschnig commented Dec 5, 2022

tautschnig commented Dec 5, 2022

aaronbembenek-aws commented Dec 5, 2022

celinval commented Dec 5, 2022

tedinski commented Dec 5, 2022

tautschnig commented Dec 6, 2022

Performance depends in surprising ways on struct definitions #1958

Performance depends in surprising ways on struct definitions #1958

Comments

aaronbembenek-aws commented Dec 4, 2022

tautschnig commented Dec 5, 2022

tautschnig commented Dec 5, 2022

aaronbembenek-aws commented Dec 5, 2022

celinval commented Dec 5, 2022

tedinski commented Dec 5, 2022

tautschnig commented Dec 6, 2022