Handling of Named Type references #99

Merged
merged 4 commits into from
Feb 16, 2021

@kumarak kumarak commented Feb 11, 2021

This PR fixes the handling of named type references in the binary. It recovers each referenced named type as a block of bytes and lifts it. Recovering the individual elements of named types is still on the TODO list and will be handled in a future PR.

@kumarak kumarak requested a review from pgoodman February 11, 2021 17:17
@kumarak kumarak marked this pull request as draft February 12, 2021 04:04
@kumarak kumarak force-pushed the handle_namedtype_references branch 4 times, most recently from 84a37d4 to 67d3834 Compare February 12, 2021 23:28
@kumarak kumarak marked this pull request as ready for review February 12, 2021 23:32
@kumarak kumarak force-pushed the handle_namedtype_references branch 4 times, most recently from 057aaf2 to 7d8f4f9 Compare February 13, 2021 00:07
@kumarak kumarak force-pushed the handle_namedtype_references branch from 7d8f4f9 to 3587c71 Compare February 13, 2021 03:43
@kumarak kumarak requested a review from pgoodman February 13, 2021 13:06
@kumarak kumarak force-pushed the handle_namedtype_references branch from 6203a10 to 4959e9c Compare February 16, 2021 02:45
Fix the handling of type cache and read bytes from memory
@kumarak kumarak requested a review from surovic February 16, 2021 15:18
@kumarak kumarak force-pushed the handle_namedtype_references branch from 4959e9c to 583a7fb Compare February 16, 2021 15:28
@kumarak kumarak force-pushed the handle_namedtype_references branch from eff8f26 to df3f3bc Compare February 16, 2021 16:29
@kumarak kumarak requested review from pgoodman and surovic February 16, 2021 18:38

// Create an arbitrary-precision integer that is size * 8 bits wide.
llvm::APInt result(size * 8, 0);
for (auto i = 0u; i < size; ++i) {
  auto byte_val = program.FindByte(addr + i).Value();
  if (remill::IsError(byte_val)) {
    LOG(ERROR) << "Unable to read value of byte at " << std::hex << addr + i

Reset the log stream to std::dec after printing (addr + i).

Otherwise all subsequent log messages will print line numbers in hex, which is unhelpful.

@@ -540,59 +542,55 @@ CreateConstFromMemory(const uint64_t addr, llvm::Type *type,
   llvm::Constant *result{nullptr};
   switch (type->getTypeID()) {
     case llvm::Type::IntegerTyID: {
-      const auto size = dl.getTypeSizeInBits(type);
+      const auto size = dl.getTypeAllocSize(type);
       auto val = ReadValueFromMemory(addr, size, arch, program);
       result = llvm::ConstantInt::get(type, val);
     } break;

File an issue to handle float, double, half, x86_fp80, etc.


-      result = llvm::ConstantStruct::get(struct_type,
-                                         llvm::ArrayRef(const_list));
+      result = llvm::ConstantStruct::get(struct_type, initializer_list);
     } break;

     case llvm::Type::ArrayTyID: {

Add vector types; they're basically a copy of arrays anyway.

   """Convert an bn `Type` instance into a `Type` instance."""
-  if str(tinfo) in cache:
-    return cache[str(tinfo)]
+  if _cache_key(tinfo) in cache:

Save _cache_key(tinfo) to a variable and re-use it everywhere in this function.

@pgoodman pgoodman merged commit 0036c42 into master Feb 16, 2021
@pgoodman pgoodman deleted the handle_namedtype_references branch February 16, 2021 22:18
pgoodman added a commit that referenced this pull request Feb 24, 2021
* Added support for remill kCategoryConditionalFunctionReturn, kCategoryConditionalIndirectJump, kCategoryConditionalDirectFunctionCall, kCategoryConditionalIndirectFunctionCall

* Updated case statement

* Remove Python2.7 support (#104)

* Removes any mention of python2

* Removes more mentions of python2

* IDA: Add a simple action to generate spec files (#94)

* IDA: Add a simple action to generate spec files

* docs: Update the example instructions

* Do not lift functions that are not in the JSON spec (#102)

* Modifies lifting to ignore functions that do not have mapped bytes in the spec

* Moves byte existence and executability check to LifFunction() and adds comments

* Handling of Named Type references (#99)

* Handling of named references

Fix the handling of type cache and read bytes from memory

* review changes

* Add vector type lifting

* add remill compat header for vector type

Co-authored-by: AkshayK

* Refactor the CMake project (#101)

* CMake: Refactor

* CMake: Update the copyright and license headers

* CMake: Refactor

* CMake: Refactor

* Misc: Remove unused remill_commit_id file

* CMake: Refactor

* CMake: Refactor

* CMake: Refactor

* CMake: Refactor

* docs: Update the dependencies in the README

* CI: Update the GitHub Actions workflow

* CI: Update the GitHub Actions workflow

* CI: Update the GitHub Actions workflow

* Packaging: Add DEB/RPM/TGZ for Linux, TGZ for macOS

* CI: Automatically create a release when pushing a tag

* CI: Include tags when obtaining version information

* CI: Automatically abort stale workflows

* CMake: Refactor

* CI: Disable shallow clone to fix version detection

* CI: Fix Python packaging

* CMake: Refactor

* CI: Update the release generator

* CMake: Only install to system packages if not doing a dev install (#109)

* Update build.yml (#112)

Limit MacOS to LLVM 11 since we have a limited number of MacOS runners.

* Add TypeCache for bn type lookup (#108)

Adding assert to convert type function

* CMake: Update default settings, fix packaging issue (#111)

 - Enables the tests and the install target in the default configuration
 - Fixes an issue with packaging, which didn't work correctly due to
   how DESTDIR was handled

* Fix ce replace bug (#110)

* Fixes a use of replaceAllUsesOf

* Move the binja_var_none_type test to should-be passing. Also, make all stack frames packed, as the way the structure types are constructed assumes every element is adjacent in memory, with i8s explicitly filling gaps

* Give __anvill_reg_XXX variables a default initializer to make compiling bitcode possible. Get rid of overly eager, evil optimization that tries to load constants from memory into allocas. Add instcombine to the set of optimizations for folding goodness

* Adds a --print_registers_before_instuctions option to inject printfs into the bitcode to dump all address-sized integer registers to stdout before each instruction

* Move binja_var_non_type back into failing tests for now

Co-authored-by: Carson Harmon

* Fix crash array size, unsupported reg, missing data var (#117)

* fix crash due to array size and unsupported reg

* Fixed assertion failure triggered in ret0.json

Co-authored-by: Peter Goodman

* Fix bytesequence and copypasta (#116)

* Fix bytesequence and copypasta issues

* Do variable references again

* Update Program.h

Useless change to force CI :-P

Co-authored-by: Peter Goodman

* Formats files and sets internal linkage to `__anvill_reg` globals (#118)

* CI: Update asset names when handling tags (#115)

* CI: Switch to the more reliable macOS 10.15 workers (#120)

* CI: Use a single job to publish releases (#122)

* CI: Automatically generate the release changelog (#123)

* Updated VisitConditionalDirectFunctionCall and VisitConditionalIndirectJump

* Delay slot fixes

Co-authored-by: Marek Surovič
Co-authored-by: Alessandro Gario
Co-authored-by: kumarak
Co-authored-by: AkshayK
Co-authored-by: Artem Dinaburg
Co-authored-by: Peter Goodman
Co-authored-by: Carson Harmon