pyedgar

CLI for scraping data from the U.S. Securities and Exchange Commision's EDGAR database.

Requirements

Python 2.7
pip

Download

git clone https://github.com/djfdev/pyedgar.git

Or, simply export a zipfile from the the Github repository and unpack it somewhere on your machine.

Setup

This tool uses a number of 3rd-party libraries, which you'll need to install to get up and running.

pip install -r requirements.txt

If you have permission denied, you may have to install with:

sudo pip install -r requirements.txt

CLI usage

The CLI accepts arguments to specify the paths of (1) input file, and (1) output file. These are both CSV files. So if you have your input file located inside of the pyedgar folder, you might type:

python cli.py --input="./input.csv" --output="./output.csv"

Input format

The input file should be a CSV with the following format:

ticker,event_date,filing_type,search_term
AAPL,20140228,8-K,5.07|items 5

Note the search_term column: this column accepts a pipe-delimited list of strings that the scraper will attempt to match a filings description against. If the term cannot be found as a string, the program will retry it as a regular expression. This allows you to do really powerful and flexible pattern matching against the search results.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
lib.py		lib.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pyedgar

Requirements

Download

Setup

CLI usage

Input format

License

About

Releases

Packages

Languages

License

djfdev/pyedgar

Folders and files

Latest commit

History

Repository files navigation

pyedgar

Requirements

Download

Setup

CLI usage

Input format

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages