This is an interview assignment with the following task: use the Scrapy framework to scrape the first 500 items (title, image URL) from sreality.cz (flats, for sale), save them in a PostgreSQL database, implement a simple HTTP server in Python that shows these 500 items on a simple page (title and image), and put everything into a single docker-compose setup, so that you can just run "docker-compose up" in the GitHub repository and see the scraped ads at http://127.0.0.1:8080.
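For context, a minimal sketch of the scraping side follows. sreality.cz renders its listings with JavaScript, so a Scrapy spider typically reads the site's JSON API rather than the HTML; the endpoint, query parameters, and field names below are assumptions for illustration, not code taken from this repository:

```python
import scrapy


class EstatesSpider(scrapy.Spider):
    """Sketch: scrape flat-for-sale listings from sreality.cz via its JSON API."""

    name = "estates"
    # Stop after ~500 items; CLOSESPIDER_ITEMCOUNT is a standard Scrapy setting.
    custom_settings = {"CLOSESPIDER_ITEMCOUNT": 500}
    # Assumed endpoint and params: category_main_cb=1 = flats, category_type_cb=1 = sell.
    api_url = (
        "https://www.sreality.cz/api/cs/v2/estates"
        "?category_main_cb=1&category_type_cb=1&per_page=60&page={page}"
    )

    def start_requests(self):
        yield scrapy.Request(self.api_url.format(page=1), cb_kwargs={"page": 1})

    def parse(self, response, page):
        data = response.json()
        # Field names assumed from the API's HAL-style payload.
        for estate in data["_embedded"]["estates"]:
            yield {
                "title": estate["name"],
                "image_url": estate["_links"]["images"][0]["href"],
            }
        if page * 60 < 500:  # 9 pages x 60 items covers the first 500
            yield scrapy.Request(
                self.api_url.format(page=page + 1),
                cb_kwargs={"page": page + 1},
            )
```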
Building it took about 12 hours.
Another ~4 hours went into adding a NoSQL database to the project, which you can see in the pull requests. The live NoSQL version should run at: https://frontend-scraper.azurewebsites.net
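For the PostgreSQL version, persistence comes down to a Scrapy item pipeline roughly like the following sketch (the table name, columns, and environment variables are assumptions):

```python
import os

import psycopg2


class PostgresPipeline:
    """Sketch: write each scraped item into a PostgreSQL table."""

    def open_spider(self, spider):
        # Connection parameters read from the environment (names assumed).
        self.conn = psycopg2.connect(
            host=os.environ.get("POSTGRES_HOST", "db"),
            dbname=os.environ.get("POSTGRES_DB", "scraper"),
            user=os.environ.get("POSTGRES_USER", "postgres"),
            password=os.environ.get("POSTGRES_PASSWORD", "postgres"),
        )
        self.cur = self.conn.cursor()
        self.cur.execute(
            "CREATE TABLE IF NOT EXISTS estates ("
            "id SERIAL PRIMARY KEY, title TEXT, image_url TEXT)"
        )
        self.conn.commit()

    def process_item(self, item, spider):
        self.cur.execute(
            "INSERT INTO estates (title, image_url) VALUES (%s, %s)",
            (item["title"], item["image_url"]),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.cur.close()
        self.conn.close()
```

Enabling such a pipeline is then a matter of registering the class under ITEM_PIPELINES in the Scrapy settings.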
- The app can be started with the following command (the page server is sketched after this list):
docker-compose build && docker-compose up
- If you want to develop or run the checks (Linux):
python3 -m venv venv && source venv/bin/activate && make dev-build
- See the Makefile for the available checks
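Once the stack is up, the page on http://127.0.0.1:8080 only needs a small HTTP server that reads the scraped rows back out of the database. Below is a minimal sketch using Python's standard http.server, reusing the assumed `estates` table and environment variables from the pipeline sketch above; the real server in the repo may differ:

```python
import html
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

import psycopg2


class EstatesHandler(BaseHTTPRequestHandler):
    """Sketch: render the first 500 scraped estates as a simple HTML page."""

    def do_GET(self):
        # A fresh connection per request keeps the sketch simple.
        conn = psycopg2.connect(
            host=os.environ.get("POSTGRES_HOST", "db"),
            dbname=os.environ.get("POSTGRES_DB", "scraper"),
            user=os.environ.get("POSTGRES_USER", "postgres"),
            password=os.environ.get("POSTGRES_PASSWORD", "postgres"),
        )
        with conn, conn.cursor() as cur:
            cur.execute("SELECT title, image_url FROM estates LIMIT 500")
            rows = cur.fetchall()
        conn.close()
        items = "".join(
            f"<div><h3>{html.escape(title)}</h3><img src='{html.escape(url)}'></div>"
            for title, url in rows
        )
        body = f"<html><body>{items}</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.end_headers()
        self.wfile.write(body.encode("utf-8"))


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), EstatesHandler).serve_forever()
```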
If I had more time, I'd finish the following:
- Unit tests and E2E tests
- Improve exception handling
- API response checking using schema packages (see the sketch after this list)
- Data re-reading (the possibility to clean the DB)
- Frontend improvements (a video indicator, whole-page reformatting)
- Mask Postgres password for production
- Add a variable number of estates per page / pagination
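On the schema-checking item: one possible approach is validating each scraped record with pydantic before it reaches the database. A minimal sketch; the model and field names mirror the scraped item and are assumptions, not planned code from this repo:

```python
from pydantic import BaseModel, HttpUrl


class EstateItem(BaseModel):
    """Sketch: expected shape of one scraped listing (names assumed)."""

    title: str
    image_url: HttpUrl


def validate(raw: dict) -> EstateItem:
    # Raises pydantic.ValidationError on malformed API data,
    # instead of letting bad rows reach the database.
    return EstateItem(**raw)
```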