This repository contains a web scraper that consolidates NYC Public School Allocations and Budget Summaries from Galaxy, the website where the NYC Department of Education (DOE) Division of Finance publishes publicly accessible data on public schools in New York City.
- Galaxy Allocations. NYC Department of Education Division of Finance. 2018-2024. https://www.nycenet.edu/offices/d_chanc_oper/budget/dbor/galaxy/galaxyallocation/default.aspx
- Galaxy Budget Summaries. NYC Department of Education Division of Finance. 2018-2024. https://www.nycenet.edu/offices/d_chanc_oper/budget/dbor/galaxy/galaxybudgetsummaryto/default.aspx
The web scraper runs in a Python Jupyter Notebook and depends on the following libraries and modules:
- Beautiful Soup
- pandas
- requests
- os (operating system)
- selenium
- numpy
- glob (unix style pathname pattern expansion)
- time (time access and conversions)
- undetected_chromedriver
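
The `glob` and `os` entries above suggest that scraped files are collected from disk and combined. As a rough illustration of that kind of consolidation step, here is a minimal sketch that merges per-school CSV files into one file. It is not the repository's actual code: the function name `consolidate_csvs`, the `source_file` column, and the assumption that each scraped page was saved as its own CSV are all hypothetical. The notebooks may do this differently, for example with pandas. The sketch uses only the standard library so it runs without extra installs.

```python
import csv
import glob
import os


def consolidate_csvs(input_dir, output_path):
    """Merge every CSV in input_dir into one file (hypothetical helper).

    Each output row is tagged with the name of the file it came from.
    Keep output_path outside input_dir so reruns do not re-read the result.
    """
    # sorted() keeps the row order stable across runs
    paths = sorted(glob.glob(os.path.join(input_dir, "*.csv")))

    rows = []
    fieldnames = ["source_file"]  # assumed extra column, not from the source data
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Record columns in first-seen order so files with
                # different headers can still share one output header
                for name in row:
                    if name not in fieldnames:
                        fieldnames.append(name)
                row["source_file"] = os.path.basename(path)
                rows.append(row)

    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        # restval fills columns that a given input file did not have
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)

    return len(rows)
```

Calling `consolidate_csvs("downloads", "all_schools.csv")` (both paths are placeholders) would write one combined file and return the number of data rows written.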
For documentation on how to work with the web scrapers in this repository, please refer to this guide: nyc-school-budgets-scrapers Web Scraper Documentation (Google document)
BetaNYC Civic Innovation Lab Team: Ashley Louie (Director), Erik Brown, Zhi Keng He, Audrey Leung, Vaishali Talwar