A Python web scraper that collects fanfiction data from AO3 (Archive of Our Own), stores it in a database, and highlights entries when they are updated.
Table with an updated entry highlighted.
Install the latest version with pip:
pip3 install ao3scraper
Create a Python virtual environment with python3 -m venv dev_venv and activate it.
Then, install required packages with:
poetry install
This will also install ao3scraper into the virtual environment.
Usage: ao3scraper [OPTIONS]

Options:
  -s, --scrape          Launches scraping mode.
  -c, --cache           Prints the last scraped table.
  -l, --list            Lists all entries in the database.
  -a, --add TEXT        Adds a single url to the database.
  --add-urls            Opens a text file to add multiple urls to the database.
  -d, --delete INTEGER  Deletes an entry from the database.
  -v, --version         Display version of ao3scraper and other info.
  --help                Show this message and exit.
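
For scripted use, the options above can be driven from Python. The following is a minimal sketch, assuming ao3scraper is installed and on your PATH; the helper names build_command and run_ao3scraper are hypothetical and not part of ao3scraper, and the URL is a placeholder.

```python
import shutil
import subprocess


def build_command(*options: str) -> list:
    """Build an argument list for the ao3scraper command line."""
    return ["ao3scraper", *options]


def run_ao3scraper(*options: str):
    """Run ao3scraper with the given options (hypothetical helper).

    Returns None when the executable is not found on PATH, otherwise the
    CompletedProcess with captured output.
    """
    if shutil.which("ao3scraper") is None:
        return None
    return subprocess.run(
        build_command(*options),
        capture_output=True,
        text=True,
        check=False,
    )


# Placeholder work URL; replace it with a real one before running.
add_command = build_command("--add", "https://archiveofourown.org/works/000000")
scrape_command = build_command("--scrape")
```

Passing the options as a list rather than a single string avoids shell quoting problems with URLs.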
ao3scraper is highly customisable, and most aspects of the program can be modified from its configuration file.
To find the configuration file location, run python3 ao3scraper -v.
ao3scraper uses rich's styling. To disable any styling options, replace the styling value with 'none'.
Fics have many attributes that are not displayed by default. To add these columns, create a new option under table_template, like so:
table_template:
- column: characters # The specified attribute
  name: Characters :) # This is what the column will be labelled as
  styles: none # Rich styling
A complete list of attributes can be found on the wiki.
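
Before editing the configuration by hand, it can help to check that each table_template entry has the expected keys. The sketch below assumes the option loads as a list of mappings with column, name, and styles keys, as in the example above; check_template and REQUIRED_KEYS are hypothetical names, and ao3scraper itself may validate the file differently.

```python
# Assumed in-memory form of table_template after the config file is loaded.
table_template = [
    {"column": "characters", "name": "Characters :)", "styles": "none"},
]

# Keys taken from the example entry; treating all three as required is an assumption.
REQUIRED_KEYS = ("column", "name", "styles")


def check_template(template):
    """Return a list of human-readable problems found in the template."""
    problems = []
    for index, entry in enumerate(template):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            problems.append(f"entry {index}: missing {', '.join(missing)}")
    return problems


def header_labels(template):
    """Return the column labels in the order they would be displayed."""
    return [entry["name"] for entry in template]
```

An empty list from check_template means every entry carries all three keys.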
If you're updating from a legacy version of ao3scraper (before 1.0.0), move fics.db to the data location.
This can be found by running python3 ao3scraper -v.
The migration wizard will then prompt you to upgrade your database.
If you accept, a backup of the current fics.db will be created in /backups, and migration will proceed.
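
If you would rather keep your own copy before accepting the migration, a manual backup is straightforward. This is a minimal sketch, not the wizard's actual implementation: backup_database is a hypothetical helper, the backups directory is assumed to sit inside the data location, and the timestamped file name is an illustrative choice.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_database(data_dir: Path, db_name: str = "fics.db") -> Path:
    """Copy the database into a backups folder and return the new path."""
    source = data_dir / db_name
    if not source.is_file():
        raise FileNotFoundError(source)

    # Assumed layout: backups live in a subdirectory of the data location.
    backup_dir = data_dir / "backups"
    backup_dir.mkdir(exist_ok=True)

    # The timestamp keeps successive backups from overwriting each other.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves file metadata
    return target
```

Call it with the data location reported by the version command, for example backup_database(Path("/path/to/data")), where the path is a placeholder.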
Contributions are always appreciated. Submit a pull request with your suggested changes!
ao3scraper would not be possible without the existence of ao3_api and the work of its contributors.