proxy_py is a program that collects proxies, saves them in a database and checks them periodically. It has a server for getting proxies with a nice API (see below).
The documentation is here -> https://proxy-py.readthedocs.io
You can donate here -> https://www.patreon.com/join/2313433
Thank you :)
There is a prepared docker image.
1 Install docker and docker compose. If you're using ubuntu:
sudo apt install docker.io docker-compose
2 Download docker compose config:
wget "https://raw.githubusercontent.com/DevAlone/proxy_py/master/docker-compose.yml"
3 Create a container
docker-compose build
4 Run
docker-compose up
This gives you a server at localhost:55555
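To confirm that the server is actually accepting connections, a small check like the one below may help. It is only a sketch: the function name is_server_up is hypothetical and not part of proxy_py, and it only tests that the TCP port is open, not that the API behind it works.

```python
import socket


def is_server_up(host="127.0.0.1", port=55555, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper, not part of proxy_py. The default port is the
    one mentioned above; change it if you changed the settings.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("server is up" if is_server_up() else "server is not reachable")
```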
To see running containers use
docker-compose ps
To stop proxy_py use
docker-compose stop
proxy_py has a server, based on aiohttp, which listens on 127.0.0.1:55555 (you can change this in the settings file) and provides proxies. To get proxies, send the following JSON request to http://127.0.0.1:55555/api/v1/ (or to another domain if the server is behind a reverse proxy):
{
"model": "proxy",
"method": "get",
"order_by": "response_time, uptime"
}
Note: order_by sorts the result by one or more fields (separated by commas). You can omit it. The required fields are model and method.
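As a rough illustration of the note above, the request body can be assembled like this. The helper name build_request is hypothetical; it only uses the three fields described here (model, method and the optional order_by).

```python
def build_request(order_by=None):
    """Build the JSON body for a "get proxies" request.

    Hypothetical helper. order_by may be None, one field name,
    or a list of field names.
    """
    body = {"model": "proxy", "method": "get"}  # the two required fields
    if order_by:
        if isinstance(order_by, str):
            order_by = [order_by]
        # several fields are sent as one comma-separated string
        body["order_by"] = ", ".join(order_by)
    return body


if __name__ == "__main__":
    print(build_request(["response_time", "uptime"]))
```

The resulting dictionary can be passed as the json argument in the requests and aiohttp examples further down.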
It returns a JSON response like this:
{
"count": 1,
"data": [{
"address": "http://127.0.0.1:8080",
"auth_data": "",
"bad_proxy": false,
"domain": "127.0.0.1",
"last_check_time": 1509466165,
"number_of_bad_checks": 0,
"port": 8080,
"protocol": "http",
"response_time": 461691,
"uptime": 1509460949
}],
"has_more": false,
"status": "ok",
"status_code": 200
}
Note: All fields except protocol, domain, port, auth_data, checking_period and address CAN be null
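Because some fields can be null, client code should not assume they are always numbers. A minimal sketch, assuming you have already decoded the "data" list into Python dictionaries (the function name sort_by_response_time is hypothetical):

```python
def sort_by_response_time(proxies):
    """Sort proxy records by response_time, placing null values last.

    Hypothetical helper. `proxies` is a list of dicts as found in the
    "data" field of the response.
    """
    return sorted(
        proxies,
        # the first tuple element pushes records with null response_time
        # to the end; the second orders the remaining ones ascending
        key=lambda p: (p.get("response_time") is None, p.get("response_time") or 0),
    )


if __name__ == "__main__":
    sample = [
        {"address": "http://127.0.0.1:8080", "response_time": 461691},
        {"address": "http://127.0.0.1:8081", "response_time": None},
    ]
    for proxy in sort_by_response_time(sample):
        print(proxy["address"])
```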
Or an error if something went wrong:
{
"error_message": "You should specify \"model\"",
"status": "error",
"status_code": 400
}
Note: status_code duplicates the HTTP status code of the response
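One possible way to handle both response shapes in client code is sketched below. The names ProxyApiError and extract_data are hypothetical and not part of proxy_py; the sketch only relies on the status, status_code, error_message and data fields shown above.

```python
class ProxyApiError(Exception):
    """Raised when the API answers with "status": "error" (hypothetical)."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def extract_data(payload):
    """Return the "data" list from a decoded response or raise ProxyApiError."""
    if payload.get("status") != "ok":
        raise ProxyApiError(
            payload.get("status_code"),
            payload.get("error_message", "unknown error"),
        )
    return payload["data"]


if __name__ == "__main__":
    error = {
        "error_message": 'You should specify "model"',
        "status": "error",
        "status_code": 400,
    }
    try:
        extract_data(error)
    except ProxyApiError as e:
        print("request failed:", e)
```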
Example using curl:
curl -X POST http://127.0.0.1:55555/api/v1/ -H "Content-Type: application/json" --data '{"model": "proxy", "method": "get"}'
Example using httpie:
http POST http://127.0.0.1:55555/api/v1/ model=proxy method=get
Example using python's requests library:
import requests
import json

def get_proxies():
    result = []
    json_data = {
        "model": "proxy",
        "method": "get",
    }
    url = "http://127.0.0.1:55555/api/v1/"
    response = requests.post(url, json=json_data)
    if response.status_code == 200:
        response = json.loads(response.text)
        for proxy in response["data"]:
            result.append(proxy["address"])
    else:
        # check error here
        pass
    return result
Example using aiohttp library:
import aiohttp
import json  # needed for json.loads below

async def get_proxies():
    result = []
    json_data = {
        "model": "proxy",
        "method": "get",
    }
    url = "http://127.0.0.1:55555/api/v1/"
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=json_data) as response:
            if response.status == 200:
                response = json.loads(await response.text())
                for proxy in response["data"]:
                    result.append(proxy["address"])
            else:
                # check error here
                pass
    return result
Read more about API here -> https://proxy-py.readthedocs.io/en/latest/api_v1_overview.html
# TODO: add readme about API v2
There is a lib.ru-inspired web interface, which consists of these pages (with a slash at the end):
- http://localhost:55555/i/get/proxy/
- http://localhost:55555/i/get/proxy_count_item/
- http://localhost:55555/i/get/number_of_proxies_to_process/
- http://localhost:55555/i/get/collector_state/
Just fork, make your changes (implement a new collector, fix a bug or whatever you want) and create a pull request.
Here are some useful guides:
If you've made changes to the code and want to check that you didn't break anything, just run
py.test
inside the virtual environment in the proxy_py project directory.
If you want to collect proxies from your own source or you need proxies that work with a particular site, you can write your own collectors and/or checkers.
- Create your checkers/collectors in the current directory using the following directory structure:
// TODO: add more detailed readme about it
local/
├── requirements.txt
├── checkers
│   └── custom_checker.py
└── collectors
    └── custom_collector.py
You can create only a checker or only a collector if you prefer.
- Create proxy_py/settings.py in the current directory with the following content:
from ._settings import *
from local.checkers.custom_checker import CustomChecker
PROXY_CHECKERS = [CustomChecker]
COLLECTORS_DIRS = ['local/collectors']
You can append your checker to PROXY_CHECKERS, or your directory to COLLECTORS_DIRS, instead of overriding them, so that the built-in ones are used as well; it's just a normal Python file. See proxy_py/_settings.py for more detailed instructions on options.
- Follow the steps in "How to install?" but download this docker-compose config instead
wget "https://raw.githubusercontent.com/DevAlone/proxy_py/master/docker-compose-with-local.yml"
and run with command
docker-compose -f docker-compose-with-local.yml up
- ...?
- Profit!
- Clone this repository
git clone https://github.com/DevAlone/proxy_py.git
- Install requirements
cd proxy_py
pip3 install -r requirements.txt
- Create settings file
cp config_examples/settings.py proxy_py/settings.py
- Install PostgreSQL and change the database configuration in the settings.py file
- (Optional) Configure alembic
- Run your application
python3 main.py
- Enjoy!