Update docs with ASUpdater.py script (#2217)
Rot127 authored Jan 7, 2024
1 parent 15d9337 commit 0d0edad
Showing 7 changed files with 52 additions and 206 deletions.
2 changes: 1 addition & 1 deletion HACK.TXT
@@ -68,7 +68,7 @@ Updating an Architecture
The update tool for Capstone is called `auto-sync` and can be found in `suite/auto-sync`.

Not all architectures are supported yet.
Run `suite/auto-sync/Update-Arch.sh -h` to get a list of currently supported architectures.
Run `suite/auto-sync/Updater/ASUpdater.py -h` to get a list of currently supported architectures.

The documentation how to update with `auto-sync` or refactor an architecture module
can be found in [docs/AutoSync.md](docs/AutoSync.md).
2 changes: 2 additions & 0 deletions requirements.txt → dev_requirements.txt
@@ -1,2 +1,4 @@
tree-sitter==0.20.1
termcolor==2.2.0
cmake==3.27.9
ninja==1.11.1.1
19 changes: 10 additions & 9 deletions docs/AutoSync.md
@@ -118,6 +118,8 @@ Rebase `llvm-capstone` onto the new LLVM release (if not already done).
```
# 1. Clone Capstone's LLVM
git clone https://github.com/capstone-engine/llvm-capstone
cd llvm-capstone
git checkout auto-sync
# 2. Rebase onto the new LLVM release and resolve the conflicts.
@@ -127,14 +129,9 @@ cd build
cmake -G Ninja -DLLVM_TARGETS_TO_BUILD=<ARCH> -DCMAKE_BUILD_TYPE=Debug ../llvm
cmake --build . --target llvm-tblgen --config Debug
# 4. Run git log and copy the hash of the release commit for the next step.
git log
# 5. Run the updater
# 4. Run the updater
cd ../../suite/auto-sync/
mkdir build
cd build
../Update-Arch.sh <ARCH> <PATH-TO-LLVM> <LLVM-RELEASE_HASH>
./Updater/ASUpdater.py -a <ARCH>
```

The update script will execute the steps described above and copy the new files to their directories.
@@ -154,10 +151,14 @@ Issue: https://github.com/capstone-engine/capstone/issues/1984

To refactor an architecture to use `auto-sync`, you need to add it to the configuration.

1. Add the architecture to the supported architectures list in `Update-Arch.sh`.
1. Add the architecture to the supported architectures list in `ASUpdater.py`.
2. Configure the `CppTranslator` for your architecture (`suite/auto-sync/CppTranslator/arch_config.json`)

Now, manually run the update commands within `Update-Arch.sh` but *skip* the `Differ` step.
Now, manually run the update commands within `ASUpdater.py` but *skip* the `Differ` step:

```
./Updater/ASUpdater.py -a <ARCH> -s IncGen Translate
```

The task after this is to:

47 changes: 30 additions & 17 deletions suite/auto-sync/README.md
@@ -21,13 +21,13 @@ sudo apt install python3-venv
# Setup virtual environment in Capstone root dir
python3 -m venv ./.venv
source ./.venv/bin/activate
pip3 install -r requirements.txt
pip3 install -r dev_requirements.txt
```

Clone C++ grammar

```
cs suite/auto-sync/
cd suite/auto-sync/
git submodule update --init --recursive ./vendor/
```

@@ -36,7 +36,7 @@ git submodule update --init --recursive ./vendor/
Check if your architecture is supported.

```
./Updater/Updater.py -h
./Updater/ASUpdater.py -h
```

Clone Capstones LLVM fork and build `llvm-tblgen`
@@ -56,10 +56,7 @@ cd ../../
Run the updater

```
TODO: REWORK
mkdir build
cd build
../Update-Arch.sh <ARCH> ./llvm-capstone
./Updater/ASUpdater.py -a <ARCH>
```

## Post-processing steps
@@ -78,23 +75,37 @@ This is a rough overview what files of an architecture are updated and where the

**Files originating from LLVM** (Automatically updated)

TODO: The "<ARCH>LLVM*" files are not renamed yet.
These files are LLVM source files which were translated from C++ to C
Not all the listed files below are used by each architecture.
But those are the most common.

- `<ARCH>Disassembler.*`: Bytes to `MCInst` decoder.
- `<ARCH>InstPrinter.*` or `<ARCH>AsmPrinter.*`: `MCInst` to asm string decoder.
- `<ARCH>BaseInfo.*`: Commonly use functions and definitions.

`*.inc` files are exclusively generated by LLVM TableGen backends:

- `<ARCH>LLVM*.*`: These files are LLVM source files which were translated from C++ to C.
- Because the translation is not perfect, those files need some hands on work afterwards (see below).
- `<ARCH>Gen*.inc`: These files are exclusively generated by LLVM TableGen backends.
`*.inc` files for the LLVM component are named like this:
- `<ARCH>Gen*.inc` (note: no `CS` in the name)

These files form the actual disassembler and assembler printer.
Additionally, we generate more details for Capstone with `llvm-tblgen`.
Like enums, operand details and other things.

They are saved also to `*.inc` files, but have the `CS` in the name to make them distinct from the LLVM generated files.

- `<ARCH>GenCS*.inc`

**Capstone module files** (Not automatically updated)

- `<ARCH>Mapping.*`: Binding code between the architecture module and the LLVM files.
- `<ARCH>Module.*`: Interface for the Capstone core.
- `<ARCH>DisassemblerExtension.*` All kind of functions which are needed by `<ARCH>LLVMDisassembler.c` but could not be generated or translated.
Those files are written by us:

- `<ARCH>DisassemblerExtension.*` All kind of functions which are needed by the LLVM component, but could not be generated or translated.
- `<ARCH>Mapping.*`: Binding code between the architecture module and the LLVM files. This is also where the detail is set.
- `<ARCH>Module.*`: Interface to the Capstone core.

### Update procedure

1. Run the `Update-Arch.sh` script.
1. Run the `ASUpdater.py` script.
2. Compare the functions in `<ARCH>DisassemblerExtension.*` to LLVM (search the function names in the LLVM root)
and update them if necessary.
3. Try to build Capstone and fix the build errors.
@@ -109,7 +120,9 @@ For details about the C++ to C translation of the LLVM files refer to `CppTransl

Documentation about the `.inc` file generation is in the [llvm-capstone](https://github.com/capstone-engine/llvm-capstone) repository.

- If some features were not generated and are missing in the `.inc` files, make sure they are defined as `AssemblerPredicate` in the `.td` files.
**Troubleshooting**

- If some features aren't generated and are missing in the `.inc` files, make sure they are defined as `AssemblerPredicate` in the `.td` files.

Correct:
```
10 changes: 8 additions & 2 deletions suite/auto-sync/Updater/PathVarHandler.py
@@ -34,6 +34,8 @@ def __init__(self) -> None:
# Load variables
with open(path_config_file) as f:
vars = json.load(f)

missing = list()
for p_name, path in vars.items():
resolved = path
for var_id in re.findall(r"\{.+}", resolved):
@@ -45,9 +47,13 @@ def __init__(self) -> None:
resolved = re.sub(var_id, str(self.paths[var_id]), resolved)
log.debug(f"Set {p_name} = {resolved}")
if not Path(resolved).exists():
log.fatal(f"Path from config file does not exist! Path: {resolved}")
exit(1)
missing.append(resolved)
self.paths[p_name] = resolved
if len(missing) > 0:
log.fatal(f"Some paths from config file are missing!")
for m in missing:
log.fatal(f"\t{m}")
exit(1)

def get_path(self, name: str) -> Path:
if name not in self.paths:
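
The `PathVarHandler.py` change above stops exiting on the first missing path and instead reports all missing paths at once. A minimal, self-contained sketch of that pattern follows. It is not the project's code: `resolve_paths` and its `exists` parameter are hypothetical names introduced for illustration, config keys are assumed to include the braces (for example `{ROOT}`), and the variable pattern is made non-greedy here so that two variables in one template are matched separately.

```python
from __future__ import annotations

import re
from pathlib import Path


def resolve_paths(raw: dict[str, str], exists=lambda p: Path(p).exists()):
    """Resolve `{VAR}` references and collect every missing path.

    Hypothetical helper for illustration; not part of PathVarHandler.py.
    Assumes a variable is defined before any entry that references it.
    """
    paths: dict[str, str] = {}
    missing: list[str] = []
    for name, template in raw.items():
        resolved = template
        # Non-greedy match: "{A}/x/{B}" yields "{A}" and "{B}" separately.
        for var_id in re.findall(r"\{.+?\}", resolved):
            if var_id not in paths:
                raise KeyError(f"{var_id} is used before it is defined")
            resolved = resolved.replace(var_id, paths[var_id])
        # Record the problem and keep going instead of exiting immediately.
        if not exists(resolved):
            missing.append(resolved)
        paths[name] = resolved
    return paths, missing
```

Passing `exists` as a parameter keeps the sketch testable without touching the file system. A caller would log each entry of `missing` and exit once after the loop, as the diff does.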
177 changes: 0 additions & 177 deletions suite/auto-sync/Updater/Update-Arch.sh

This file was deleted.

1 change: 1 addition & 0 deletions suite/auto-sync/vendor/tree-sitter-cpp
Submodule tree-sitter-cpp added at a71474
