Readme Updates #687

Merged
merged 11 commits into from
Nov 6, 2021
267 changes: 145 additions & 122 deletions README.md
Expand Up @@ -77,14 +77,6 @@
- [proxy.TestCase](#proxytestcase)
- [Override Startup Flags](#override-startup-flags)
- [With unittest.TestCase](#with-unittesttestcase)
- [Plugin Developer and Contributor Guide](#plugin-developer-and-contributor-guide)
- [High level architecture](#high-level-architecture)
- [Everything is a plugin](#everything-is-a-plugin)
- [Internal Documentation](#internal-documentation)
- [Development Guide](#development-guide)
- [Setup Local Environment](#setup-local-environment)
- [Setup Git Hooks](#setup-git-hooks)
- [Sending a Pull Request](#sending-a-pull-request)
- [Utilities](#utilities)
- [TCP](#tcp-sockets)
- [new_socket_connection](#new_socket_connection)
Expand All @@ -97,6 +89,7 @@
- [CLI Usage](#cli-usage)
- [Run Dashboard](#run-dashboard)
- [Inspect Traffic](#inspect-traffic)
- [Chrome DevTools Protocol](#chrome-devtools-protocol)
- [Frequently Asked Questions](#frequently-asked-questions)
- [Threads vs Threadless](#threads-vs-threadless)
- [SyntaxError: invalid syntax](#syntaxerror-invalid-syntax)
Expand All @@ -106,6 +99,14 @@
- [Docker image not working on MacOS](#docker-image-not-working-on-macos)
- [ValueError: filedescriptor out of range in select](#valueerror-filedescriptor-out-of-range-in-select)
- [None:None in access logs](#nonenone-in-access-logs)
- [Plugin Developer and Contributor Guide](#plugin-developer-and-contributor-guide)
- [High level architecture](#high-level-architecture)
- [Everything is a plugin](#everything-is-a-plugin)
- [Internal Documentation](#internal-documentation)
- [Development Guide](#development-guide)
- [Setup Local Environment](#setup-local-environment)
- [Setup Git Hooks](#setup-git-hooks)
- [Sending a Pull Request](#sending-a-pull-request)
- [Flags](#flags)
- [Changelog](#changelog)
- [v2.x](#v2x)
Expand Down Expand Up @@ -507,7 +508,7 @@ Along with the proxy request log, you must also see a http web server request lo
### FilterByUpstreamHostPlugin

Drops traffic by inspecting upstream host.
By default, plugin drops traffic for `google.com` and `www.google.com`.
By default, the plugin drops traffic for `facebook.com` and `www.facebook.com`.

Start `proxy.py` as:

Expand All @@ -516,7 +517,7 @@ Start `proxy.py` as:
--plugins proxy.plugin.FilterByUpstreamHostPlugin
```

Verify using `curl -v -x localhost:8899 http://google.com`:
Verify using `curl -v -x localhost:8899 http://facebook.com`:

```bash
... [redacted] ...
Expand Down Expand Up @@ -726,15 +727,22 @@ Modify `ModifyChunkResponsePlugin` to your taste. Example, instead of sending ha

This plugin uses `Cloudflare` hosted `DNS-over-HTTPS` [API](https://developers.cloudflare.com/1.1.1.1/encrypted-dns/dns-over-https/make-api-requests/dns-json) (json).

Start `proxy.py` as:
`DoH` mandates an HTTP/2 compliant client. Unfortunately, `proxy.py`
doesn't provide one yet, so we use a dependency. Install it:

```bash
❯ pip install "httpx[http2]"
```

Now start `proxy.py` as:

```bash
❯ proxy \
--plugins proxy.plugin.CloudflareDnsResolverPlugin
```

By default, `CloudflareDnsResolverPlugin` runs in `security` mode (provides malware protection). Use `--cloudflare-dns-mode family` to also enable
adult content protection.
By default, `CloudflareDnsResolverPlugin` runs in `security` mode and provides malware protection.
Use `--cloudflare-dns-mode family` to also enable adult content protection.
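
To make the JSON flavour of `DoH` more concrete, here is a minimal sketch of what such a query could look like. It is not the plugin's implementation: it only builds the request with the standard library and never sends it. The resolver hostnames in `DOH_HOSTS` and the helper names `build_doh_request` and `extract_addresses` are assumptions made for illustration; check them against the Cloudflare documentation linked above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed hostnames for Cloudflare's filtered resolvers (illustration only).
DOH_HOSTS = {
    'security': 'security.cloudflare-dns.com',
    'family': 'family.cloudflare-dns.com',
}


def build_doh_request(name, mode='security', rtype='A'):
    """Build (but do not send) a DNS-over-HTTPS JSON query."""
    query = urlencode({'name': name, 'type': rtype})
    url = 'https://{0}/dns-query?{1}'.format(DOH_HOSTS[mode], query)
    # The JSON API is selected through the accept header.
    return Request(url, headers={'accept': 'application/dns-json'})


def extract_addresses(payload):
    """Return the `data` field of every answer in a JSON response body."""
    return [answer['data'] for answer in json.loads(payload).get('Answer', [])]
```

A real client would send the request over HTTP/2 (for example with `httpx`) and pass the response body to `extract_addresses`.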

### CustomDnsResolverPlugin

Expand Down Expand Up @@ -815,19 +823,18 @@ on the command line.
Plugins are called in the same order as they are passed. For example,
say we are using both `FilterByUpstreamHostPlugin` and
`RedirectToCustomServerPlugin`. The idea is to drop all incoming `http`
requests for `google.com` and `www.google.com` and redirect other
requests for `facebook.com` and `www.facebook.com` and redirect other
`http` requests to our inbuilt web server.

Hence, in this scenario it is important to use
`FilterByUpstreamHostPlugin` before `RedirectToCustomServerPlugin`.
If we enable `RedirectToCustomServerPlugin` before `FilterByUpstreamHostPlugin`,
`google` requests will also get redirected to inbuilt web server,
`facebook` requests will also get redirected to the inbuilt web server,
instead of being dropped.
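
The effect of ordering can be sketched without `proxy.py` at all. The snippet below is a simplified model, not the real plugin API: each hypothetical plugin is a plain function that receives the upstream host and returns either a (possibly rewritten) host or `None` to drop the request.

```python
def filter_by_upstream_host(host):
    """Drop (return None) for filtered hosts, pass everything else through."""
    blocked = {'facebook.com', 'www.facebook.com'}
    return None if host in blocked else host


def redirect_to_custom_server(host):
    """Rewrite every request so that it goes to the inbuilt web server."""
    return 'localhost:8899'


def run_chain(plugins, host):
    """Call plugins in the order given; stop as soon as one drops the request."""
    for plugin in plugins:
        host = plugin(host)
        if host is None:
            return None
    return host


# Filter first: the request is dropped before the redirect can see it.
dropped = run_chain([filter_by_upstream_host, redirect_to_custom_server], 'facebook.com')
# Redirect first: the filter only ever sees the rewritten host.
redirected = run_chain([redirect_to_custom_server, filter_by_upstream_host], 'facebook.com')
```

Here `dropped` is `None`, while `redirected` is `'localhost:8899'`, which mirrors the behaviour described above.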

# End-to-End Encryption

By default, `proxy.py` uses `http` protocol for communication with clients e.g. `curl`, `browser`.
For enabling end-to-end encrypting using `tls` / `https` first generate certificates:
By default, `proxy.py` uses the `http` protocol for communication with clients e.g. `curl`, `browser`. To enable end-to-end encryption using `tls` / `https`, first generate certificates. **Check out** the repository and run:

```bash
make https-certificates
Expand Down Expand Up @@ -1177,7 +1184,7 @@ Note that:
## Non-blocking Mode

Start `proxy.py` in non-blocking embedded mode with default configuration
by using `start` method: Example:
by using the `Proxy` context manager. Example:

```python
import proxy
Expand All @@ -1190,10 +1197,10 @@ if __name__ == '__main__':
Note that:

1. `Proxy` is similar to `main`, except `Proxy` does not block.
1. Internally `Proxy` is a context manager.
It will start `proxy.py` when called and will shut it down
2. Internally `Proxy` is a context manager.
3. It will start `proxy.py` when called and will shut it down
once the scope ends.
1. Just like `main`, startup flags with `Proxy`
4. Just like `main`, startup flags with `Proxy`
can be customized by either passing flags as a list of
input arguments e.g. `Proxy(['--port', '8899'])` or
by passing flags as kwargs e.g. `Proxy(port=8899)`.
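
The pattern behind these notes can be illustrated with the standard library alone. The class below is a hypothetical stand-in, not `Proxy` itself: it assumes a single `--port` flag, accepts it either as a list of arguments or as a keyword argument, starts a server in a background thread on entry, and shuts it down when the scope ends.

```python
import argparse
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class BackgroundServer:
    """Hypothetical stand-in for a non-blocking, context-managed server."""

    def __init__(self, input_args=None, **opts):
        parser = argparse.ArgumentParser()
        parser.add_argument('--port', type=int, default=0)
        # Flags given as a list are parsed first; keyword arguments override them.
        self.flags = parser.parse_args(input_args or [])
        for key, value in opts.items():
            setattr(self.flags, key, value)
        self.server = None
        self.thread = None

    def __enter__(self):
        address = ('127.0.0.1', self.flags.port)  # port 0 picks a free port
        self.server = ThreadingHTTPServer(address, BaseHTTPRequestHandler)
        self.thread = threading.Thread(target=self.server.serve_forever, daemon=True)
        self.thread.start()  # returns immediately; the caller is not blocked
        return self

    def __exit__(self, *exc_info):
        self.server.shutdown()      # stop serve_forever()
        self.server.server_close()  # release the listening socket
        self.thread.join()
```

With this sketch, `BackgroundServer(['--port', '8899'])` and `BackgroundServer(port=8899)` end up with the same flags, and code inside the `with` block keeps running while the server accepts connections.
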
Expand All @@ -1216,8 +1223,15 @@ if __name__ == '__main__':

## Loading Plugins

You can, of course, list plugins to load in the input arguments list of `proxy.main` or
`Proxy` constructor. Use the `--plugins` flag when starting from command line:
Users can use `--plugins` flag multiple times to load multiple plugins.
See [Unable to load plugins](#unable-to-load-plugins) if you are running into issues.

When using in embedded mode, you have a few more options. Example:

1. Provide a fully-qualified name of the plugin class as `bytes` to the `proxy.main` method or `proxy.Proxy` context manager.
2. Provide `type` instance of the plugin class. This is especially useful if you plan to define plugins at runtime.
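
Accepting both forms is straightforward to support. The following is only a sketch of how such a resolution step might work, with a hypothetical helper name `resolve_plugin`; it is not taken from the `proxy.py` source.

```python
import importlib


def resolve_plugin(plugin):
    """Return a class for either a dotted name given as bytes or a class object."""
    if isinstance(plugin, type):
        # Already a class, e.g. one defined at runtime.
        return plugin
    module_name, _, class_name = plugin.decode('utf-8').rpartition('.')
    return getattr(importlib.import_module(module_name), class_name)


class RuntimePlugin:
    """A class defined at runtime; it needs no importable dotted path."""


by_name = resolve_plugin(b'collections.OrderedDict')
by_type = resolve_plugin(RuntimePlugin)
```

The `bytes` form suits configuration that arrives as text, while the `type` form avoids having to place the class in an importable module.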

Example, load a single plugin using `--plugins` flag:

```python
import proxy
Expand All @@ -1228,8 +1242,9 @@ if __name__ == '__main__':
])
```

For simplicity you can pass the list of plugins to load as a keyword argument to `proxy.main` or
the `Proxy` constructor:
For simplicity, you can also pass the list of plugins as a keyword argument to `proxy.main` or the `Proxy` constructor.

Example:

```python
import proxy
Expand All @@ -1242,11 +1257,6 @@ if __name__ == '__main__':
])
```

Note that it supports:

1. The fully-qualified name of a class as `bytes`
2. Any `type` instance of a plugin class. This is especially useful for plugins defined at runtime

# Unit testing with proxy.py

## proxy.TestCase
Expand Down Expand Up @@ -1321,98 +1331,6 @@ class TestProxyPyEmbedded(unittest.TestCase):
or simply set up / tear down `proxy.py` within
`setUpClass` and `tearDownClass` class methods.
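
As a minimal sketch of the class-level setup / teardown approach, the test case below starts a server once for the whole class and stops it afterwards. A standard library HTTP server is used as a stand-in so that the example runs without `proxy.py`; the class and test names are hypothetical.

```python
import threading
import unittest
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class TestWithSharedServer(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # Started once, before any test in this class runs.
        cls.server = ThreadingHTTPServer(('127.0.0.1', 0), BaseHTTPRequestHandler)
        cls.thread = threading.Thread(target=cls.server.serve_forever, daemon=True)
        cls.thread.start()

    @classmethod
    def tearDownClass(cls):
        # Stopped once, after the last test in this class has finished.
        cls.server.shutdown()
        cls.server.server_close()
        cls.thread.join()

    def test_server_is_listening(self):
        self.assertTrue(self.thread.is_alive())
        self.assertGreater(self.server.server_address[1], 0)
```

Replacing the stand-in with an embedded `proxy.py` instance follows the same shape: start it in `setUpClass`, shut it down in `tearDownClass`.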

# Plugin Developer and Contributor Guide

## High level architecture

```bash
+-------------+
| Proxy([]) |
+------+------+
|
|
+-----------v--------------+
| AcceptorPool(...) |
+------------+-------------+
|
|
+-----------------+ | +-----------------+
| Acceptor(..) <-------------+-----------> Acceptor(..) |
+-----------------+ +-----------------+
```

`proxy.py` is made with performance in mind. By default, `proxy.py`
will try to utilize all CPU cores available to it for accepting new
client connections. This is achieved by starting `AcceptorPool`, which
listens on the configured server port. Then, `AcceptorPool` starts `Acceptor`
processes (`--num-workers`) to accept incoming client connections.

Each `Acceptor` process delegates the accepted client connection
to a `Work` class. Currently, `HttpProtocolHandler` is the default
work class hardcoded into the code.

`HttpProtocolHandler` simply assumes that incoming clients will follow
HTTP specification. Specific HTTP proxy and HTTP server implementations
are written as plugins of `HttpProtocolHandler`.

See documentation of `HttpProtocolHandlerPlugin` for available lifecycle hooks.
Use `HttpProtocolHandlerPlugin` to add new features for http(s) clients. For an example,
see `HttpWebServerPlugin`.

## Everything is a plugin

Within `proxy.py` everything is a plugin.

- We enabled `proxy server` plugins using `--plugins` flag.
Proxy server `HttpProxyPlugin` is a plugin of `HttpProtocolHandler`.
Further, the proxy server allows plugins through the `HttpProxyBasePlugin` specification.

- All the proxy server [plugin examples](#plugin-examples) were implementing
`HttpProxyBasePlugin`. See documentation of `HttpProxyBasePlugin` for available
lifecycle hooks. Use `HttpProxyBasePlugin` to modify behavior of http(s) proxy protocol
between client and upstream server. Example,
[FilterByUpstreamHostPlugin](#filterbyupstreamhostplugin).

- We also enabled inbuilt `web server` using `--enable-web-server`.
Web server `HttpWebServerPlugin` is a plugin of `HttpProtocolHandler`
and implements `HttpProtocolHandlerPlugin` specification.

- There also is a `--disable-http-proxy` flag. It disables inbuilt proxy server.
Use this flag with `--enable-web-server` flag to run `proxy.py` as a programmable
http(s) server.

## Development Guide

### Setup Local Environment

Contributors must start `proxy.py` from source to verify and develop new features / fixes.

See [Run proxy.py from command line using repo source](#from-command-line-using-repo-source) for details.


[![WARNING](https://img.shields.io/static/v1?label=MacOS&message=warning&color=red)](https://github.com/abhinavsingh/proxy.py/issues/642#issuecomment-960819271) On `macOS`
you must install `Python` using `pyenv`, as `Python` installed via `homebrew` tends
to be problematic. See linked thread for more details.

### Setup Git Hooks

Pre-commit hook ensures tests are passing.

1. `cd /path/to/proxy.py`
2. `ln -s $PWD/git-pre-commit .git/hooks/pre-commit`

Pre-push hook ensures lint and tests are passing.

1. `cd /path/to/proxy.py`
2. `ln -s $PWD/git-pre-push .git/hooks/pre-push`

### Sending a Pull Request

Every pull request is tested using GitHub actions.

See [GitHub workflow](https://github.com/abhinavsingh/proxy.py/tree/develop/.github/workflows)
for list of tests.

# Utilities

## TCP Sockets
Expand Down Expand Up @@ -1611,6 +1529,8 @@ FILE

# Run Dashboard

**This is a WIP and may not work as documented**

Dashboard is currently under development and not yet bundled with `pip` packages.
To run dashboard, you must checkout the source.

Expand Down Expand Up @@ -1652,6 +1572,17 @@ the websocket connection that dashboard established with the `proxy.py` server.

[![Proxy.Py Dashboard Inspect Traffic](https://raw.githubusercontent.com/abhinavsingh/proxy.py/develop/Dashboard.png)](https://github.com/abhinavsingh/proxy.py)

# Chrome DevTools Protocol

For scenarios where you want direct access to the `Chrome DevTools` protocol websocket endpoint,
start `proxy.py` as:

```bash
$ proxy --enable-devtools --enable-events
```

Now point your CDT instance to `ws://localhost:8899/devtools`.

# Frequently Asked Questions

## Threads vs Threadless
Expand Down Expand Up @@ -1781,6 +1712,98 @@ few obvious ones include:
1. Client established a connection but never completed the request.
2. A plugin returned a response prematurely, avoiding connection to upstream server.

# Plugin Developer and Contributor Guide

## High level architecture

```bash
+-------------+
| Proxy([]) |
+------+------+
|
|
+-----------v--------------+
| AcceptorPool(...) |
+------------+-------------+
|
|
+-----------------+ | +-----------------+
| Acceptor(..) <-------------+-----------> Acceptor(..) |
+-----------------+ +-----------------+
```

`proxy.py` is made with performance in mind. By default, `proxy.py`
will try to utilize all CPU cores available to it for accepting new
client connections. This is achieved by starting `AcceptorPool`, which
listens on the configured server port. Then, `AcceptorPool` starts `Acceptor`
processes (`--num-workers`) to accept incoming client connections.

Each `Acceptor` process delegates the accepted client connection
to a `Work` class. Currently, `HttpProtocolHandler` is the default
work class hardcoded into the code.

`HttpProtocolHandler` simply assumes that incoming clients will follow
HTTP specification. Specific HTTP proxy and HTTP server implementations
are written as plugins of `HttpProtocolHandler`.

See documentation of `HttpProtocolHandlerPlugin` for available lifecycle hooks.
Use `HttpProtocolHandlerPlugin` to add new features for http(s) clients. For an example,
see `HttpWebServerPlugin`.

## Everything is a plugin

Within `proxy.py` everything is a plugin.

- We enabled `proxy server` plugins using `--plugins` flag.
Proxy server `HttpProxyPlugin` is a plugin of `HttpProtocolHandler`.
Further, the proxy server allows plugins through the `HttpProxyBasePlugin` specification.

- All the proxy server [plugin examples](#plugin-examples) were implementing
`HttpProxyBasePlugin`. See documentation of `HttpProxyBasePlugin` for available
lifecycle hooks. Use `HttpProxyBasePlugin` to modify behavior of http(s) proxy protocol
between client and upstream server. Example,
[FilterByUpstreamHostPlugin](#filterbyupstreamhostplugin).

- We also enabled inbuilt `web server` using `--enable-web-server`.
Web server `HttpWebServerPlugin` is a plugin of `HttpProtocolHandler`
and implements `HttpProtocolHandlerPlugin` specification.

- There also is a `--disable-http-proxy` flag. It disables inbuilt proxy server.
Use this flag with `--enable-web-server` flag to run `proxy.py` as a programmable
http(s) server.
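
The layering described above, a handler that delegates to plugins through lifecycle hooks, can be sketched in a few lines. All names below (`HandlerPlugin`, `on_request`, `on_response`, `Handler`) are hypothetical and chosen for illustration; the real hook names are documented on `HttpProtocolHandlerPlugin` and `HttpProxyBasePlugin`.

```python
class HandlerPlugin:
    """Hypothetical base class: override only the hooks you need."""

    def on_request(self, request):
        return request

    def on_response(self, response):
        return response


class UppercaseResponse(HandlerPlugin):
    """Example plugin that modifies responses and leaves requests alone."""

    def on_response(self, response):
        return response.upper()


class Handler:
    """Hypothetical protocol handler that delegates to its plugins in order."""

    def __init__(self, plugins):
        # Plugins are passed as classes and instantiated per handler.
        self.plugins = [plugin() for plugin in plugins]

    def handle(self, request):
        for plugin in self.plugins:
            request = plugin.on_request(request)
        response = 'echo: ' + request
        for plugin in self.plugins:
            response = plugin.on_response(response)
        return response
```

With no plugins, `Handler([]).handle('ping')` returns the plain echo; adding `UppercaseResponse` changes the response without touching the handler itself.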

## Development Guide

### Setup Local Environment

Contributors must start `proxy.py` from source to verify and develop new features / fixes.

See [Run proxy.py from command line using repo source](#from-command-line-using-repo-source) for details.


[![WARNING](https://img.shields.io/static/v1?label=MacOS&message=warning&color=red)](https://github.com/abhinavsingh/proxy.py/issues/642#issuecomment-960819271) On `macOS`
you must install `Python` using `pyenv`, as `Python` installed via `homebrew` tends
to be problematic. See linked thread for more details.

### Setup Git Hooks

Pre-commit hook ensures tests are passing.

1. `cd /path/to/proxy.py`
2. `ln -s $PWD/git-pre-commit .git/hooks/pre-commit`

Pre-push hook ensures lint and tests are passing.

1. `cd /path/to/proxy.py`
2. `ln -s $PWD/git-pre-push .git/hooks/pre-push`

### Sending a Pull Request

Every pull request is tested using GitHub actions.

See [GitHub workflow](https://github.com/abhinavsingh/proxy.py/tree/develop/.github/workflows)
for list of tests.

# Flags

```bash
Expand Down