Scan for secrets, endpoints, and other sensitive data after decompiling and deobfuscating Android files (.apk, .xapk, .dex, .jar, .class, .smali, .zip, .aar, .arsc, .aab, .jadx.kts).
- Why use APKscan?
- Features
- Installation
- Usage
- Configuring Scanning Rules
- Configuring Decompilers
- Concurrency and Performance
- Contributing
- License
APKs (Android Package Kits) often leak secrets due to over-reliance on security through obscurity. Developers sometimes leave sensitive information such as API keys, tokens, and credentials hidden within the code, assuming that they won't be found easily since the code has been compiled and obfuscated. However, this approach is fundamentally flawed, and such secrets can be exposed, leading to potential security vulnerabilities.
APKscan also helps identify the attack surface of the backend by uncovering forgotten endpoints, test data payloads, and other traces of backend interfaces that developers might have unintentionally exposed. These endpoints can provide attackers with access to sensitive data or functionalities that are not meant for public use. By scanning for such endpoints and test data, APKscan assists in ensuring that the backend is secure and that no unnecessary exposure is left in the deployed applications.
APKscan can help quickly identify sensitive locations in the code, such as SSL pinning libraries, root detection functions, and other security mechanisms. Identifying these functions can speed up reverse engineering and app manipulation by quickly revealing critical points where an app enforces its security policies, making it easier to bypass them with tools like Frida. By pinpointing these areas, APKscan aids in understanding an app's security mechanisms and potential weaknesses.
APKscan allows you to automate the process of scanning for secrets in any number of applications, saving you time and ensuring thorough coverage.
Utilize one or more decompilers and deobfuscators to increase the chances of finding hidden secrets.
- Supports all popular decompilers including JADX, APKTool, CFR, Procyon, Krakatau, and Fernflower, providing flexibility and robustness in your scanning process.
- Uses enjarify-adapter to convert the Dalvik bytecode in .apk files into Java bytecode on the fly, so the resulting .jar can be processed by decompilers/deobfuscators that do not support .apks directly.
Define your own secret locator rules or use the default ones provided. This flexibility allows you to tailor the scanning process to your specific needs and improve the detection accuracy of sensitive information.
- Support for common formats: SecretLocator JSON, secret-patterns-db YAML, gitleaks TOML, and simple key-value pairs.
Choose from multiple output formats (JSON, YAML, or text) and organize the results by input file or locator. This makes it easier to integrate with other tools and workflows, and to analyze the findings effectively.
Decompile and scan a wide range of Android-related files, including .apk, .xapk, .dex, .jar, .class, .smali, .zip, .aar, .arsc, .aab, and .jadx.kts files.
- NEW: .xapk -> .apk(s) unpacking/extraction support added in v0.4.0.
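To illustrate what unpacking an .xapk involves, here is a minimal sketch that assumes an XAPK is an ordinary ZIP archive bundling one or more .apk files. The function name `extract_apks` is hypothetical, and this is not necessarily how APKscan performs the extraction internally.

```python
import zipfile
from pathlib import Path


def extract_apks(xapk_file, dest_dir: Path) -> list[Path]:
    """Extract every .apk member of an XAPK (ZIP) archive into dest_dir."""
    extracted = []
    with zipfile.ZipFile(xapk_file) as archive:
        for member in archive.namelist():
            if member.lower().endswith(".apk"):
                # ZipFile.extract returns the path of the file it wrote.
                extracted.append(Path(archive.extract(member, dest_dir)))
    return extracted
```

Each returned path could then be passed to apkscan like any other .apk file.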
APKscan offers advanced options for concurrency, decompilation, and scanning, enabling you to optimize the performance and behavior of the tool to suit your environment and requirements.
APKscan can be installed from PyPI or from source.
pip3 install apkscan
git clone https://github.com/LucasFaudman/apkscan.git
cd apkscan
python3 -m venv .venv
source .venv/bin/activate
pip3 install -e .
cd ../
The most basic way to use APKscan is to decompile an APK using the default decompiler, JADX, and scan it using the default secret locator rules in default.json.
apkscan file-to-scan.apk
Multiple sets of Secret Locators are included and can be referenced by name. For example, to scan for only AWS credentials and endpoints:
apkscan file-to-scan.apk -r aws endpoints
A slightly more complex example: this time, 3 APKs will be decompiled and then scanned using the custom rules at /path/to/custom/rules.json. The output will be written to output_file.yaml in YAML format, and the results will be grouped by which secret locator was matched. Files generated during decompilation will be removed after scanning.
apkscan -r /path/to/custom/rules.json -o output_file.yaml -f yaml -g locator -c file1.apk file2.apk file3.apk
Or in long form:
apkscan --rules /path/to/custom/rules.json --output output_file.yaml --format yaml --groupby locator --cleanup file1.apk file2.apk file3.apk
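When scanning many applications from a script, the same command line can be assembled programmatically. The sketch below only builds the argument list from the long options shown above; the helper name `build_apkscan_command` is hypothetical, and actually running the command assumes apkscan is installed and on your PATH.

```python
import shlex


def build_apkscan_command(files, rules=None, output=None, fmt=None,
                          groupby=None, cleanup=False):
    """Assemble an apkscan argument list from the documented long options."""
    # Files go first: --rules accepts several values, so anything placed
    # directly after it would be read as another rule name or path.
    cmd = ["apkscan", *files]
    if rules:
        cmd += ["--rules", *rules]
    if output:
        cmd += ["--output", output]
    if fmt:
        cmd += ["--format", fmt]
    if groupby:
        cmd += ["--groupby", groupby]
    if cleanup:
        cmd.append("--cleanup")
    return cmd


cmd = build_apkscan_command(
    ["file1.apk", "file2.apk", "file3.apk"],
    rules=["/path/to/custom/rules.json"],
    output="output_file.yaml",
    fmt="yaml",
    groupby="locator",
    cleanup=True,
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` to perform the scan.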
usage: apkscan [-h] [-r [SECRET_LOCATOR_FILES ...]] [-o SECRETS_OUTPUT_FILE]
[-f {text,json,yaml}] [-g {file,locator,both}]
[-c | --cleanup | --no-cleanup] [-q] [--jadx [JADX]]
[--apktool [APKTOOL]] [--cfr [CFR]] [--procyon [PROCYON]]
[--krakatau [KRAKATAU]] [--fernflower [FERNFLOWER]]
[--enjarify-choice {auto,never,always}]
[--unpack-xapks | --no-unpack-xapks]
[-d | --deobfuscate | --no-deobfuscate]
[-w DECOMPILER_WORKING_DIR]
[--decompiler-output-suffix DECOMPILER_OUTPUT_SUFFIX]
[--decompiler-extra-args DECOMPILER_EXTRA_ARGS [DECOMPILER_EXTRA_ARGS ...]]
[-dct {thread,process,main}] [-dro {completed,submitted}]
[-dmw DECOMPILER_MAX_WORKERS] [-dcs DECOMPILER_CHUNKSIZE]
[-dto DECOMPILER_TIMEOUT] [-sct {thread,process,main}]
[-sro {completed,submitted}] [-smw SCANNER_MAX_WORKERS]
[-scs SCANNER_CHUNKSIZE] [-sto SCANNER_TIMEOUT]
[FILES_TO_SCAN ...]
APKscan v0.4.0 - Scan for secrets, endpoints, and other sensitive
data after decompiling and deobfuscating Android files. (.apk,
.xapk, .dex, .jar, .class, .smali, .zip, .aar, .arsc, .aab, .jadx.kts)
(c) Lucas Faudman, 2024. License information in LICENSE file. Credits
to the original authors of all dependencies used in this project.
options:
-h, --help show this help message and exit
Input Options:
FILES_TO_SCAN Path(s) to Java files to decompile and scan.
-r [SECRET_LOCATOR_FILES ...], --rules [SECRET_LOCATOR_FILES ...]
Path(s) to secret locator rules/patterns files OR
names of included locator sets. Files can be in
SecretLocator JSON, secret-patterns-db YAML, or
Gitleak TOML formats. Included locator sets:
all_secret_locators, aws, azure, cloud, curated,
default, endpoints, gcp, generic, gitleaks, high-
confidence, key_locators, leakin-regexes,
locator_sort, nuclei-regexes, secret. If not provided,
default rules will be used. See: apkscan/src/apkscan/secret_l
ocators/default.json
Output Options:
-o SECRETS_OUTPUT_FILE, --output SECRETS_OUTPUT_FILE
Output file for secrets found.
-f {text,json,yaml}, --format {text,json,yaml}
Output format for secrets found.
-g {file,locator,both}, --groupby {file,locator,both}
Group secrets by input file or locator. Default is
'both'.
-c, --cleanup, --no-cleanup
Remove decompiled output directories after scanning.
-q, --quiet Suppress output from subprocesses.
Decompiler Choices:
Choose which decompiler(s) to use. Optionally specify path to decompiler
binary. Default is JADX.
--jadx [JADX], -J [JADX]
Use JADX Java decompiler.
--apktool [APKTOOL], -A [APKTOOL]
Use APKTool SMALI disassembler.
--cfr [CFR], -C [CFR]
Use CFR Java decompiler. Requires Enjarify.
--procyon [PROCYON], -P [PROCYON]
Use Procyon Java decompiler. Requires Enjarify.
--krakatau [KRAKATAU], -K [KRAKATAU]
Use Krakatau Java decompiler. Requires Enjarify.
--fernflower [FERNFLOWER], -F [FERNFLOWER]
Use Fernflower Java decompiler. Requires Enjarify.
--enjarify-choice {auto,never,always}, -EC {auto,never,always}
When to use Enjarify. Default is 'auto' which means
use only when needed.
--unpack-xapks, --no-unpack-xapks
Unpack XAPK files into APKs before decompiling.
Default is True.
Decompiler Advanced Options:
Options for Java decompiler.
-d, --deobfuscate, --no-deobfuscate
Deobfuscate file before scanning.
-w DECOMPILER_WORKING_DIR, --decompiler-working-dir DECOMPILER_WORKING_DIR
Working directory where files will be decompiled.
--decompiler-output-suffix DECOMPILER_OUTPUT_SUFFIX
Suffix for decompiled output directory names. Default
is '-decompiled'.
--decompiler-extra-args DECOMPILER_EXTRA_ARGS [DECOMPILER_EXTRA_ARGS ...]
Additional arguments to pass to decompilers in form
quoted whitespace separated '<DECOMPILER_NAME>
<EXTRA_ARGS>...'. For example: --decompiler-extra-args
'jadx --no-debug-info,--no-inline'.
-dct {thread,process,main}, --decompiler-concurrency-type {thread,process,main}
Type of concurrency to use for decompilation. Default
is 'thread'.
-dro {completed,submitted}, --decompiler-results-order {completed,submitted}
Order to process results from decompiler. Default is
'completed'.
-dmw DECOMPILER_MAX_WORKERS, --decompiler-max-workers DECOMPILER_MAX_WORKERS
Maximum number of workers to use for decompilation.
-dcs DECOMPILER_CHUNKSIZE, --decompiler-chunksize DECOMPILER_CHUNKSIZE
Number of files to decompile per thread/process.
-dto DECOMPILER_TIMEOUT, --decompiler-timeout DECOMPILER_TIMEOUT
Timeout for decompilation in seconds.
Secret Scanner Advanced Options:
Options for secret scanner.
-sct {thread,process,main}, --scanner-concurrency-type {thread,process,main}
Type of concurrency to use for scanning. Default is
'process'.
-sro {completed,submitted}, --scanner-results-order {completed,submitted}
Order to process results from scanner. Default is
'completed'.
-smw SCANNER_MAX_WORKERS, --scanner-max-workers SCANNER_MAX_WORKERS
Maximum number of workers to use for scanning.
-scs SCANNER_CHUNKSIZE, --scanner-chunksize SCANNER_CHUNKSIZE
Number of files to scan per thread/process.
-sto SCANNER_TIMEOUT, --scanner-timeout SCANNER_TIMEOUT
Timeout for scanning in seconds.
A Secret Locator is a specific pattern designed to detect sensitive information within files. These locators help automate the identification of secrets, such as API keys, client IDs, passwords, and other sensitive data that may inadvertently be included in codebases. Here’s a breakdown of how a Secret Locator is structured and how to configure them for your scans:
Example Secret Locator for OpenAI API key in JSON:
{
"id": "openai-api-key",
"name": "OpenAI API Key",
"pattern": "sk-\\w{20}T3BlbkFJ\\w{20}",
"secret_group": 0,
"description": "OpenAI API Key",
"confidence": "high",
"severity": "high",
"tags": [
"OpenAI",
"API Key",
"Secret Key",
"AI"
]
},
Field | Required | Description | Used For |
---|---|---|---|
id | Yes | Unique identifier for the locator. | Grouping output by locator. |
name | Yes | Display name for the locator. | Printed when found, upcoming features. |
pattern | Yes | Unique regex pattern to search for. | Regex to match when locating secrets. |
secret_group | No | Regex capturing group number to extract. Defaults to 0 (entire match). | Extracting the secret from a match. |
description | No | Description of the secret and its impact if leaked. | Upcoming features (search). |
confidence | No | How likely a match is to be a true positive. | Upcoming features (search, filter output). |
severity | No | How severe the risk of leaking this secret is. | Upcoming features (search, filter output). |
tags | No | Tags/keywords. | Upcoming features (search, filter output). |
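To make the roles of pattern and secret_group concrete, here is a minimal sketch that applies a locator to a string with Python's re module. The function `find_secrets` is hypothetical and is not APKscan's actual scanner; it only illustrates how the two fields could interact.

```python
import re

locator = {
    "id": "openai-api-key",
    "name": "OpenAI API Key",
    "pattern": r"sk-\w{20}T3BlbkFJ\w{20}",
    "secret_group": 0,
}


def find_secrets(locator: dict, text: str) -> list[str]:
    """Return the secret extracted from every match of the locator's pattern."""
    regex = re.compile(locator["pattern"])
    group = locator.get("secret_group", 0)  # 0 means the entire match
    return [match.group(group) for match in regex.finditer(text)]


# Fabricated value with the right shape, not a real key.
fake_key = "sk-" + "a" * 20 + "T3BlbkFJ" + "b" * 20
sample = f'String apiKey = "{fake_key}";'
print(find_secrets(locator, sample))
```

With secret_group set to a non-zero number, only that capturing group is returned, which is useful when the pattern also matches surrounding delimiters.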
APKscan supports multiple common formats for secret patterns including:
Format | Filetype(s) | Link to More Patterns | Credit |
---|---|---|---|
SecretLocator | JSON, YAML | all_secret_locators.json | @lucasfaudman |
secret-patterns-db | YAML | Link to DB | @mazen160 |
gitleaks | TOML | Link to Gitleaks | @zricethezav |
Key-value pairs | JSON, YAML | Link | @douglascrockford |
NOTE: Multiple files in different formats can be provided at once after the -r/--rules arg. Duplicate patterns will be removed. Duplicate IDs will be combined in the output.
Need another format? Feel free to open an Issue, edit def load_secret_locators in secret_scanner.py, and/or open a PR.
[
{
"id": "gcp-api-key",
"name": "GCP API Key",
"pattern": "\\b(AIza[0-9A-Za-z\\\\-_]{35})(?:['|\\\"|\\n|\\r|\\s|\\x60|;]|$)",
"secret_group": 1,
"description": "Google Cloud Platform API key",
"confidence": "high",
"severity": null,
"tags": [
"Google",
"Cloud",
"API Key"
]
},
{
"id": "generic-key",
"name": "Generic Key",
"pattern": "(?i)\\b\\w+(?:secret_?)?(?:api_?)?key[\\s=:]+[\\'\"][\\w/\\-:@.]+[\\'\"]",
"secret_group": 0,
"description": null,
"confidence": "low",
"severity": null,
"tags": []
}
]
patterns:
- pattern:
name: AWS API Gateway
regex: "[0-9a-z]+.execute-api.[0-9a-z.-_]+.amazonaws.com"
confidence: low
- pattern:
name: AWS ARN
regex: "arn:aws:[a-z0-9-]+:[a-z]{2}-[a-z]+-[0-9]+:[0-9]+:.+"
confidence: low
- pattern:
name: AWS Client ID
regex: "(A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}"
confidence: low
title = "gitleaks config"
[[rules]]
description = "Alibaba AccessKey ID"
id = "alibaba-access-key-id"
regex = '''(?i)\b((LTAI)(?i)[a-z0-9]{20})(?:['|\"|\n|\r|\s|\x60|;]|$)'''
keywords = [
"ltai",
]
[[rules]]
description = "Alibaba Secret Key"
id = "alibaba-secret-key"
regex = '''(?i)(?:alibaba)(?:[0-9a-z\-_\t .]{0,20})(?:[\s|']|[\s|"]){0,3}(?:=|>|:=|\|\|:|<=|=>|:)(?:'|\"|\s|=|\x60){0,5}([a-z0-9]{30})(?:['|\"|\n|\r|\s|\x60|;]|$)'''
secretGroup = 1
keywords = [
"alibaba",
]
[[rules]]
description = "Asana Client ID"
id = "asana-client-id"
regex = '''(?i)(?:asana)(?:[0-9a-z\-_\t .]{0,20})(?:[\s|']|[\s|"]){0,3}(?:=|>|:=|\|\|:|<=|=>|:)(?:'|\"|\s|=|\x60){0,5}([0-9]{16})(?:['|\"|\n|\r|\s|\x60|;]|$)'''
secretGroup = 1
keywords = [
"asana",
]
{
"OpenAI API Key": "sk-\\w{20}T3BlbkFJ\\w{20}",
"GCP API Key": "\\b(AIza[0-9A-Za-z\\\\-_]{35})(?:['|\\\"|\\n|\\r|\\s|\\x60|;]|$)",
"Generic Key": "(?i)\\b\\w+(?:secret_?)?(?:api_?)?key[\\s=:]+[\\'\"][\\w/\\-:@.]+[\\'\"]"
}
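The key-value format above carries only a name and a regex, so the remaining SecretLocator fields have to be filled in somehow. The sketch below shows one possible expansion; the function `locators_from_key_values` and the id scheme are assumptions for illustration, not APKscan's actual loader.

```python
import json
import re


def locators_from_key_values(raw_json: str) -> list[dict]:
    """Expand {"name": "regex"} pairs into SecretLocator-style dicts."""
    locators = []
    for name, pattern in json.loads(raw_json).items():
        locators.append({
            # Hypothetical id scheme: lowercase the name and collapse
            # every run of non-alphanumeric characters into "-".
            "id": re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-"),
            "name": name,
            "pattern": pattern,
            "secret_group": 0,  # documented default: the entire match
        })
    return locators


raw = r'{"OpenAI API Key": "sk-\\w{20}T3BlbkFJ\\w{20}"}'
print(locators_from_key_values(raw))
```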
APKscan supports many popular APK and Java decompilers/disassemblers/deobfuscators, increasing the chance of successfully finding secrets.
NOTE: APKscan uses enjarify-adapter to convert the Dalvik bytecode in .apk files into Java bytecode on the fly, so the resulting .jar can be processed by decompilers/deobfuscators that do not support .apks directly.
Tool | Requires Enjarify | Link to Project | Credit |
---|---|---|---|
JADX | No | Link | @skylot |
APKTool | No | Link | @iBotPeaches |
CFR | Yes | Link | @leibnitz27 |
Procyon | Yes | Link | @mstrobel |
Krakatau | Yes | Link | @Storyyeller |
Fernflower | Yes | Link | @fernflower |
Multiple decompilers can be used at once by providing the arguments below. Each optionally accepts a path to the binary of the tool. When no path is provided, the binary on the standard path is used (the output of which jadx, which apktool, etc.).
Decompiler Choices:
Choose which decompiler(s) to use. Optionally specify path to decompiler
binary. Default is JADX.
--jadx [JADX], -J [JADX]
Use JADX Java decompiler.
--apktool [APKTOOL], -A [APKTOOL]
Use APKTool SMALI disassembler.
--cfr [CFR], -C [CFR]
Use CFR Java decompiler. Requires Enjarify.
--procyon [PROCYON], -P [PROCYON]
Use Procyon Java decompiler. Requires Enjarify.
--krakatau [KRAKATAU], -K [KRAKATAU]
Use Krakatau Java decompiler. Requires Enjarify.
--fernflower [FERNFLOWER], -F [FERNFLOWER]
Use Fernflower Java decompiler. Requires Enjarify.
--enjarify-choice {auto,never,always}, -EC {auto,never,always}
When to use Enjarify. Default is 'auto' which means
use only when needed.
--unpack-xapks, --no-unpack-xapks
Unpack XAPK files into APKs before decompiling.
Default is True.
Decompiler Advanced Options:
Options for Java decompiler.
-d, --deobfuscate, --no-deobfuscate
Deobfuscate file before scanning.
-w DECOMPILER_WORKING_DIR, --decompiler-working-dir DECOMPILER_WORKING_DIR
Working directory where files will be decompiled.
--decompiler-output-suffix DECOMPILER_OUTPUT_SUFFIX
Suffix for decompiled output directory names. Default is '-decompiled'.
--decompiler-extra-args DECOMPILER_EXTRA_ARGS [DECOMPILER_EXTRA_ARGS ...]
Additional arguments to pass to decompilers in form quoted whitespace separated '<DECOMPILER_NAME>
<EXTRA_ARGS>...'. For example: --decompiler-extra-args 'jadx --no-debug-info,--no-inline'.
Examples:
Decompile with both JADX and APKTool:
apkscan --jadx --apktool -o "combined-output.json" app-to-scan.apk
Decompile with JADX located at "/non/standard/path/jadx" and with the Procyon and CFR binaries in the standard location:
apkscan --jadx "/non/standard/path/jadx" --cfr --procyon -o "combined-output.json" app-to-scan.apk
Decompile multiple APKs with all decompilers and output YAML:
apkscan -J -A -C -P -K -F -o "combined.yaml" -f yaml app-to-scan1.apk app-to-scan2.apk app-to-scan3.xapk
Provide extra args to JADX and CFR:
apkscan --jadx --cfr --decompiler-extra-args "jadx --add-debug-lines --no-inline-anonymous" "cfr --renamedupmembers true" app-to-scan.apk
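Because each decompiler flag falls back to the binary on the standard path when no path is given, it can help to check which tools are discoverable before a scan. This is a small sketch using shutil.which, the Python equivalent of the which command. Only jadx and apktool are named as executables in this README; the other executable names are assumptions and may differ depending on how the tools were installed.

```python
import shutil

# jadx and apktool are the executable names used in this README; the
# remaining names are assumptions and may differ on your system.
DECOMPILER_BINARIES = ["jadx", "apktool", "cfr", "procyon", "krakatau", "fernflower"]


def locate_decompilers(names=DECOMPILER_BINARIES) -> dict:
    """Map each decompiler name to its path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_decompilers().items():
    print(f"{name:<12}{path or 'not found on PATH'}")
```

For any tool reported as not found, pass its location explicitly, as in the --jadx "/non/standard/path/jadx" example.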
Using multiple decompilers increases the chance of successfully finding secrets for several reasons:
Different decompilers use various algorithms and heuristics to reverse-engineer bytecode back into source code. By leveraging multiple decompilers, you can capture a broader spectrum of decompilation strategies, increasing the likelihood of accurately reconstructing the original source code.
APKs often use obfuscation techniques to make reverse engineering more difficult. Some decompilers are better at handling specific types of obfuscation than others. By using multiple decompilers, you can overcome a wider range of obfuscation techniques, ensuring more thorough analysis.
No single decompiler can guarantee perfect output for all APKs. Some decompilers might miss certain parts of the code or fail to decompile specific constructs correctly. Combining the outputs from multiple decompilers helps ensure a more complete and accurate reconstruction of the application.
Having multiple decompiled versions of the same APK allows for cross-verification. Discrepancies between the outputs can be analyzed to identify potential decompilation errors or areas that need further investigation.
APKscan offers a comprehensive set of concurrency and performance options that are configurable in a similar way for both decompilation and secret scanning processes. These options allow you to optimize the speed and efficiency of APKscan based on your system's capabilities and the size of your workload.
Both the decompilation AND secret scanning processes can be configured using the following options:
Decompiler Advanced Options:
Options for Java decompiler.
<truncated>
-dct {thread,process,main}, --decompiler-concurrency-type {thread,process,main}
Type of concurrency to use for decompilation. Default is 'thread'.
-dro {completed,submitted}, --decompiler-results-order {completed,submitted}
Order to process results from decompiler. Default is 'completed'.
-dmw DECOMPILER_MAX_WORKERS, --decompiler-max-workers DECOMPILER_MAX_WORKERS
Maximum number of workers to use for decompilation.
-dcs DECOMPILER_CHUNKSIZE, --decompiler-chunksize DECOMPILER_CHUNKSIZE
Number of files to decompile per thread/process.
-dto DECOMPILER_TIMEOUT, --decompiler-timeout DECOMPILER_TIMEOUT
Timeout for decompilation in seconds.
Secret Scanner Advanced Options:
Options for secret scanner.
-sct {thread,process,main}, --scanner-concurrency-type {thread,process,main}
Type of concurrency to use for scanning. Default is 'process'.
-sro {completed,submitted}, --scanner-results-order {completed,submitted}
Order to process results from scanner. Default is 'completed'.
-smw SCANNER_MAX_WORKERS, --scanner-max-workers SCANNER_MAX_WORKERS
Maximum number of workers to use for scanning.
-scs SCANNER_CHUNKSIZE, --scanner-chunksize SCANNER_CHUNKSIZE
Number of files to scan per thread/process.
-sto SCANNER_TIMEOUT, --scanner-timeout SCANNER_TIMEOUT
Timeout for scanning in seconds.
Specify the type of concurrency to use with {thread, process, main}.
- thread: Uses threading, suitable for I/O-bound tasks.
- process: Uses multiprocessing, more efficient for CPU-bound tasks.
- main: Runs in the main thread, useful for debugging or environments where concurrency is restricted.
Control the order in which results are processed with {completed, submitted}.
- completed: Processes results as soon as they are completed.
- submitted: Processes results in the order they were submitted.
Set the maximum number of workers (threads or processes) to use.
- Adjust based on your system's CPU and memory resources.
Define the number of files to submit to each thread/process.
- This helps balance the workload and can improve performance.
Set a timeout for each thread/process in seconds.
- This ensures that stalled tasks do not indefinitely block the overall process.
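The options above map naturally onto Python's concurrent.futures module. The following sketch shows how concurrency type, results order, max workers, and timeout could interact. It is an illustration under that assumption, not APKscan's actual implementation; the function names `work` and `run` are hypothetical, and chunk size is left out for brevity.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed


def work(seconds: float) -> float:
    """Stand-in for decompiling or scanning one file."""
    time.sleep(seconds)
    return seconds


def run(tasks, concurrency_type="thread", results_order="completed",
        max_workers=None, timeout=None):
    """Run work() over tasks and return results in the requested order."""
    if concurrency_type == "main":
        return [work(task) for task in tasks]  # no executor, runs inline
    executor_cls = ThreadPoolExecutor if concurrency_type == "thread" else ProcessPoolExecutor
    with executor_cls(max_workers=max_workers) as executor:
        futures = [executor.submit(work, task) for task in tasks]
        if results_order == "completed":
            # Yield futures as they finish instead of in submission order.
            futures = as_completed(futures, timeout=timeout)
        return [future.result(timeout=timeout) for future in futures]


if __name__ == "__main__":
    delays = [0.3, 0.1, 0.2]
    print(run(delays, results_order="submitted"))  # same order as delays
    print(run(delays, results_order="completed"))  # shortest delay first
```

Processing in completed order lets fast tasks flow through without waiting on slow ones, while submitted order keeps output aligned with the input list.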
To optimize the performance of APKscan, consider the following tips:
- Decompilation is memory-intensive: Set the maximum number of decompiler workers based on your available RAM to avoid system slowdowns or crashes.
- Balance workload between decompilation and scanning: Consider whether your workload is more focused on decompiling or scanning.
- If using a large number of decompilers and scanning for few secret locators, allocate more workers to decompilation and fewer to scanning.
- Conversely, if using fewer decompilers but scanning for many secret locators, allocate more workers to the scanning process.
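As a rough illustration of the balancing tips above, the sketch below splits a fixed worker budget between the -dmw and -smw options. The helper `worker_flags` and its share-based heuristic are hypothetical; treat the numbers as a starting point and lower the decompiler share if memory becomes the bottleneck.

```python
import os


def worker_flags(decompile_share: float, total_workers=None) -> list[str]:
    """Split a worker budget between -dmw (decompiler) and -smw (scanner)."""
    total = total_workers or os.cpu_count() or 2
    total = max(total, 2)  # at least one worker per stage
    decompiler = min(max(round(total * decompile_share), 1), total - 1)
    scanner = total - decompiler
    return ["-dmw", str(decompiler), "-smw", str(scanner)]


# Many decompilers, few locators: favour decompilation.
print(worker_flags(0.75, total_workers=8))
# Few decompilers, many locators: favour scanning.
print(worker_flags(0.25, total_workers=8))
```

The returned flags can be appended to any apkscan command line.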
By fine-tuning these concurrency and performance options, you can make the most of APKscan's capabilities, ensuring efficient and effective secret detection across large and diverse sets of files.
Contributions welcome! Whether you're interested in fixing bugs, adding new features, improving documentation, or sharing ideas, any input is valuable.
- Fork the Repository: Start by forking the repository on GitHub. This will create a copy of the project in your own GitHub account.
- Clone the Repository: Clone the forked repository to your local machine.
  git clone https://github.com/LucasFaudman/apkscan.git
  cd apkscan
- Create a Branch: Create a new branch for your changes.
  git checkout -b my-feature-branch
- Make Changes: Make your changes in the code, documentation, or both.
- Commit Changes: Commit your changes with a descriptive commit message.
  git add .
  git commit -m "Description of the changes"
- Push Changes: Push your changes to your forked repository.
  git push origin my-feature-branch
- Create a Pull Request: Go to the original repository and create a pull request from your branch. Provide a detailed description of your changes and any relevant information.
If you encounter any bugs, have suggestions, or need help, please open an issue on GitHub. Make sure to provide as much detail as possible, including steps to reproduce the issue, error messages, and screenshots if applicable.
See LICENSE for details.