Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate why mir-linker test requires an unwind annotation #1978

Closed
zhassan-aws opened this issue Dec 8, 2022 · 11 comments · Fixed by #2301
Closed

Investigate why mir-linker test requires an unwind annotation #1978

zhassan-aws opened this issue Dec 8, 2022 · 11 comments · Fixed by #2301
Assignees
Labels
[C] Internal Tracks some internal work. I.e.: Users should not be affected. T-CBMC Issue related to an existing CBMC issue T-High Priority Tag issues that have high priority
Milestone

Comments

@zhassan-aws
Copy link
Contributor

This is concerned with the mir-linker test (tests/cargo-kani/mir-linker/src/lib.rs).

Before CBMC 5.72.0, this test did not require an unwind annotation.

With CBMC 5.72.0, not specifying an unwind value causes it to unwind forever (see #1941).

Investigate why.

@zhassan-aws zhassan-aws added the [C] Internal Tracks some internal work. I.e.: Users should not be affected. label Dec 8, 2022
@tautschnig
Copy link
Member

It seems that we end up with a comparison to two constant-to-pointer type casts, where one of the constants (because of how we now do simplification) ends up have a different (but still perfectly correct) bitvector type. This fools a subsequent equality test. I'm trying to figure out how to make the equality test sufficiently smart.

tautschnig added a commit to tautschnig/cbmc that referenced this issue Dec 12, 2022
This is to make sure that we can simplify equality tests over
pointer-typed expressions where at least one of them was created via a
sequence of simplification steps. Seen in
model-checking/kani#1978, but the new
simplification code is actually triggered by mm_io1, r_w_ok10, and
union12 regression tests.
@tautschnig
Copy link
Member

Fixed in diffblue/cbmc#7427.

@tautschnig tautschnig removed their assignment Dec 14, 2022
@tautschnig
Copy link
Member

It seems that we end up with a comparison to two constant-to-pointer type casts, where one of the constants (because of how we now do simplification) ends up have a different (but still perfectly correct) bitvector type. This fools a subsequent equality test. I'm trying to figure out how to make the equality test sufficiently smart.

On a second thought I have decided to discard the proposed change to CBMC. Instead, I started wondering where those pointer-typed constants with non-NULL value are coming from. It turns out Kani directly emitted that. From the symtab:

              {
                "id": "struct",
                "sub": [
                  {
                    "id": "constant",
                    "namedSub": {
                      "value": {
                        "id": "FFFFFFFFFFFFFFFF"
                      },
                      "type": {
                        "id": "pointer",
                        "sub": [
                          {
                            "id": "unsignedbv",
                            "namedSub": {
                              "width": {
                                "id": "8"
                              }
                            }
                          }
                        ],
                        "namedSub": {
                          "width": {
                            "id": "64"
                          }
                        }
                      }
                    }
                  }
                ],
                "namedSub": {
                  "type": {
                    "id": "struct_tag",
                    "namedSub": {
                      "identifier": {
                        "id": "tag-_14121113017464461636"
                      }
                    }
                  }
                }
              }

which in turn is the result of translating semver-1.0.14/src/identifier.rs around line 97. Does Rust have native pointer-typed numeric constants? In C, one would expect an integral value with a pointer type cast on top of it. Here, however, it seems that Kani generated a pointer with value 0xFFFFFFFFFFFFFFFF (and not a type cast of this integral constant to a pointer). It may still be necessary to do more work on the CBMC side, but I'd first ask that this use of pointer-typed constants be clarified.

@tedinski
Copy link
Contributor

Interesting. I expect the Rust compiler to try to maintain providence reasonably well, so I'm not (without digging in) where this is coming from. It could easily be mistranslation in Kani.

I understand that in C you'd only ever get a pointer constant by writing an integer and casting it to a pointer type. Am I understanding you correctly that this is what you'd want for goto as well? If so, I think the validation should reject a raw pointer constant.

@tautschnig
Copy link
Member

I understand that in C you'd only ever get a pointer constant by writing an integer and casting it to a pointer type. Am I understanding you correctly that this is what you'd want for goto as well? If so, I think the validation should reject a raw pointer constant.

We obviously lack a formalisation of GOTO programs at this point, but I'm not sure what semantics to attribute to a pointer-typed constant in the (not formalised) memory model of GOTO programs. The null-pointer constant we store as having a value of "NULL" (not zero!). The only other pointer-typed expression that I'd consider a legitimate constant is taking the address of an object.

@tedinski
Copy link
Contributor

Nominating this as high priority, since it might be a translation bug, and it should be simple to figure out (by asserting about types in the right constructor in our bindings)

@tedinski tedinski added the T-High Priority Tag issues that have high priority label Dec 21, 2022
@zhassan-aws
Copy link
Contributor Author

zhassan-aws commented Dec 22, 2022

Generating a pointer-typed numeric constant can be reproduced on this small program:

#[cfg_attr(kani, kani::proof)]
fn main() {
    let _p = 0x0123456789ABCDEF as *const u8;
}

Symbol table:

                                    {
                                        "id": "constant",
                                        "namedSub": {
                                            "value": {
                                                "id": "123456789ABCDEF"
                                            },
                                            "type": {
                                                "id": "pointer",
                                                "sub": [
                                                    {
                                                        "id": "unsignedbv",
                                                        "namedSub": {
                                                            "width": {
                                                                "id": "8"
                                                            }
                                                        }
                                                    }
                                                ],
                                                "namedSub": {
                                                    "width": {
                                                        "id": "64"
                                                    }
                                                }
                                            }
                                        }
                                    }

The problem doesn't seem to be coming from MIR because the cast is explicit:
The problem does seem to be coming from the MIR because it has a const literal of a pointer type:

// Item: Fn(Instance { def: Item(WithOptConstParam { did: DefId(0:3 ~ pointer[3fe5]::main), const_param_did: None }), substs: [] })
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn main() -> () {
    let mut _0: ();                      // return place in scope 0 at pointer.rs:2:11: 2:11
    let _1: *const u8;                   // in scope 0 at pointer.rs:3:9: 3:11
    scope 1 {
        debug _p => _1;                  // in scope 1 at pointer.rs:3:9: 3:11
    }

    bb0: {
        _1 = const {0x123456789abcdef as *const u8}; // scope 0 at pointer.rs:3:14: 3:45
                                         // mir::Constant
                                         // + span: pointer.rs:3:14: 3:45
                                         // + literal: Const { ty: *const u8, val: Value(Scalar(0x0123456789abcdef)) }
        return;                          // scope 0 at pointer.rs:4:2: 4:2
    }
}

Specifically these lines in the comment:

                                         // mir::Constant
                                         // + span: pointer.rs:3:14: 3:45
                                         // + literal: Const { ty: *const u8, val: Value(Scalar(0x0123456789abcdef)) }

@zhassan-aws
Copy link
Contributor Author

zhassan-aws commented Dec 22, 2022

This diff eliminates the pointer constant from the symbol table file generated by Kani:

$ git diff
diff --git a/kani-compiler/src/codegen_cprover_gotoc/codegen/operand.rs b/kani-compiler/src/codegen_cprover_gotoc/codegen/operand.rs
index 7756bc95229..0bc4a56fca9 100644
--- a/kani-compiler/src/codegen_cprover_gotoc/codegen/operand.rs
+++ b/kani-compiler/src/codegen_cprover_gotoc/codegen/operand.rs
@@ -284,7 +284,8 @@ fn codegen_scalar(&mut self, s: Scalar, ty: Ty<'tcx>, span: Option<&Span>) -> Ex
                 unreachable!("ZST is no longer represented as a scalar")
             }
             (Scalar::Int(_), ty::RawPtr(tm)) => {
-                Expr::pointer_constant(s.to_u64().unwrap(), self.codegen_ty(tm.ty).to_pointer())
+                Expr::int_constant(s.to_u64().unwrap(), Type::unsigned_int(64))
+                    .cast_to(self.codegen_ty(tm.ty).to_pointer())
             }

The resulting symbol table for the mir-linker test becomes:

                                        "id": "typecast",
                                        "sub": [
                                            {
                                                "id": "constant",
                                                "namedSub": {
                                                    "value": {
                                                        "id": "FFFFFFFFFFFFFFFF"
                                                    },
                                                    "type": {
                                                        "id": "unsignedbv",
                                                        "namedSub": {
                                                            "width": {
                                                                "id": "64"
                                                            }
                                                        }
                                                    }
                                                }
                                            }
                                        ],
                                        "namedSub": {
                                            "type": {
                                                "id": "pointer",
                                                "sub": [
                                                    {
                                                        "id": "unsignedbv",
                                                        "namedSub": {
                                                            "width": {
                                                                "id": "8"
                                                            }
                                                        }
                                                    }
                                                ],
                                                "namedSub": {
                                                    "width": {
                                                        "id": "64"
                                                    }
                                                }
                                            }
                                        }

The change passes regressions, but it does not eliminate the need to specify an unwind value. Not sure if I should create a PR for it. @tautschnig?

@tautschnig tautschnig self-assigned this Dec 22, 2022
@tautschnig
Copy link
Member

Adding this back onto my list. I'll try to see which additional changes might be necessary on the CBMC side.

@zhassan-aws
Copy link
Contributor Author

Thanks @tautschnig. And do you think it's still worthwhile to open a PR to switch to using a pointer cast for pointer constants?

@rahulku rahulku added this to the Maintenance milestone Jan 5, 2023
tautschnig added a commit to tautschnig/cbmc that referenced this issue Mar 1, 2023
In 848e633 a cast to bv was inserted to block interpreting floatbv
type casts from taking place. It was unnecessarily inserted for all
bitvector types. While this does not result in wrong semantics, it may
block simplification for happening when we end up (via other simplifier
rules) creating a bv (and not (un)signed bv) typed constant. All of
these transformations are correct, but we may end up with an equality
over pointer-typed constants where the underlying constant is a(n)
(un)signed bv on one side, and a bv on the other side. The bit patterns
match, so the back-end will correctly solve this, but the simplifier
cannot.

Observed when studying
model-checking/kani#1978.
@tautschnig
Copy link
Member

Thanks @tautschnig. And do you think it's still worthwhile to open a PR to switch to using a pointer cast for pointer constants?

With apologies for taking this long to get back to this: yes, we do still want this fix to be merged. It might, however, be a good idea to combine it with a change to the test that drops the unwind annotation. That you can drop once diffblue/cbmc#7571 is merged into CBMC and the above fix to use pointer casts is applied.

@tautschnig tautschnig added the T-CBMC Issue related to an existing CBMC issue label Mar 1, 2023
@tautschnig tautschnig assigned zhassan-aws and unassigned tautschnig Mar 1, 2023
tautschnig added a commit to tautschnig/cbmc that referenced this issue Mar 2, 2023
In 848e633 a cast to bv was inserted to block interpreting floatbv
type casts from taking place. It was unnecessarily inserted for all
bitvector types. While this does not result in wrong semantics, it may
block simplification for happening when we end up (via other simplifier
rules) creating a bv (and not (un)signed bv) typed constant. All of
these transformations are correct, but we may end up with an equality
over pointer-typed constants where the underlying constant is a(n)
(un)signed bv on one side, and a bv on the other side. The bit patterns
match, so the back-end will correctly solve this, but the simplifier
cannot.

Observed when studying
model-checking/kani#1978.
tautschnig added a commit to tautschnig/cbmc that referenced this issue Mar 6, 2023
In 848e633 a cast to bv was inserted to block interpreting floatbv
type casts from taking place. It was unnecessarily inserted for all
bitvector types. While this does not result in wrong semantics, it may
block simplification for happening when we end up (via other simplifier
rules) creating a bv (and not (un)signed bv) typed constant. All of
these transformations are correct, but we may end up with an equality
over pointer-typed constants where the underlying constant is a(n)
(un)signed bv on one side, and a bv on the other side. The bit patterns
match, so the back-end will correctly solve this, but the simplifier
cannot.

Observed when studying
model-checking/kani#1978.
tautschnig added a commit to tautschnig/kani that referenced this issue Mar 17, 2023
- Re-enable tests that had to be disabled with the toolchain upgrade in
  model-checking#2149. Fixes model-checking#2286, fixes model-checking#2191.
- Do not generate non-NULL pointer constants. Together with the CBMC
  version update this avoids the need for an unwinding annotation in the
  mir-linker test. Fixes model-checking#1978.
- CBMC 5.79.0 ships simplifier improvements that enable constant
  propagation to avoid slow-down with the Display trait. Fixes model-checking#1996.
- CBMC 5.79.0 ships SMT back-end fixes. Fixes model-checking#2002.

Co-authored-by: Zyad Hassan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[C] Internal Tracks some internal work. I.e.: Users should not be affected. T-CBMC Issue related to an existing CBMC issue T-High Priority Tag issues that have high priority
Projects
No open projects
Status: No status
Development

Successfully merging a pull request may close this issue.

4 participants