Fail less, parse more links
Adam Baxter committed Apr 27, 2022
1 parent 22a182e commit 5475d1b
Showing 4 changed files with 504 additions and 481 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -1,5 +1,5 @@
# tplink-grab
-Downloads all GPL tarballs from TP-Link by parsing `https://www.tp-link.com/au/choose-your-location/`, then extracting country-specific `support/gpl-code/` pages to get lists of tarballs
+Downloads all GPL tarballs (and zips and rars!) from TP-Link by parsing `https://www.tp-link.com/au/choose-your-location/`, then extracting country-specific `support/gpl-code/` pages to get lists of tarballs
The pages are structured in such a way that they'll either have direct links to `tar.gz` files or similar, or Javascript generates links to a page like `https://www.tp-link.com/phppage/gpl-res-list.html?model=Deco%20M5&appPath=kz` for each model and country code
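
As an illustration of the two link shapes described above, here is a minimal Python sketch that tells a direct archive link apart from a per-model listing page and rebuilds the listing URL from a model name and country code. The helper names (`is_archive_link`, `model_page_url`) and the suffix list are assumptions made for this example; they are not taken from the repository's code.

```python
from urllib.parse import quote, urlsplit

# Assumed set of archive suffixes; the README only names tarballs, zips and rars.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".rar")

# Listing page named in the README.
GPL_RES_LIST = "https://www.tp-link.com/phppage/gpl-res-list.html"


def is_archive_link(url: str) -> bool:
    """Return True if the URL path ends in a known archive suffix."""
    # Only the path is inspected, so query strings do not confuse the check.
    path = urlsplit(url).path.lower()
    return path.endswith(ARCHIVE_SUFFIXES)


def model_page_url(model: str, app_path: str) -> str:
    """Build the per-model listing URL for one model and country code."""
    # safe="" percent-encodes every reserved character, including "/".
    return f"{GPL_RES_LIST}?model={quote(model, safe='')}&appPath={quote(app_path, safe='')}"
```

Calling `model_page_url("Deco M5", "kz")` reproduces the example URL quoted in the README text above.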


@@ -40,6 +40,7 @@ Future plans to dump this into SQLite and compress with https://github.com/phire
* output/{sha256sum of url} - WARC file used as cache

## TODO
* Reduce amount of log spam - use https://gist.github.com/bdarnell/3118509 or similar

* second_pass: grab all links to tarballs, deduplicate, write metadata to sqlite (HEAD requests?), compare with already downloaded tarballs (?)
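
One possible shape for the deduplication part of the `second_pass` item, sketched in Python with the standard-library `sqlite3` module. The table name `tarballs`, its columns, and the helper names are hypothetical. Keying on the SHA-256 of the URL mirrors the cache naming mentioned above, but hashing the UTF-8 bytes of the URL is an assumption. HEAD requests and the comparison against already downloaded files are left out.

```python
import hashlib
import sqlite3
from typing import Iterable


def url_key(url: str) -> str:
    """SHA-256 hex digest of the URL (assumes UTF-8 encoding)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def record_links(conn: sqlite3.Connection, urls: Iterable[str]) -> int:
    """Store each distinct URL once and return how many rows were newly added."""
    # Hypothetical schema: the primary key on url does the deduplication.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tarballs ("
        "url TEXT PRIMARY KEY, url_sha256 TEXT NOT NULL)"
    )
    before = conn.total_changes
    # INSERT OR IGNORE silently skips URLs that are already present.
    conn.executemany(
        "INSERT OR IGNORE INTO tarballs (url, url_sha256) VALUES (?, ?)",
        ((u, url_key(u)) for u in urls),
    )
    conn.commit()
    return conn.total_changes - before
```

Because the uniqueness constraint lives in the database, repeated runs over overlapping link lists only add the URLs not seen before.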
