
Scraper is broken "How to Convert Python HTML to Jupyter NB" references a missing tag... https://github.com/marsja/jupyter/blob/master/convert_html_jupyter_notebook_tutorial.ipynb #2

Open
richlysakowski opened this issue May 22, 2021 · 1 comment


@richlysakowski

The example on the website below does not work, because the article tells the reader to search for a class, "post-content", that does not exist in the WordPress page.

The example scrapes nothing because no "post-content" element is present in the Response object, so the "get_data()" function returns None.
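One way to confirm this is to list the class names that actually occur in the page and check whether "post-content" is among them. The sketch below uses only the standard library; `ClassCollector`, `classes_in`, and `SAMPLE_HTML` are hypothetical names, and the sample markup (including the "entry-content" class) is made up for illustration, not copied from the live page.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears on any start tag."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def classes_in(html):
    """Return the set of class names found in an HTML string."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return collector.classes


# Made-up stand-in for the downloaded page (an assumption, not the real markup).
SAMPLE_HTML = (
    '<article><div class="entry-content wide">'
    "<pre>import pandas</pre></div></article>"
)

found = classes_in(SAMPLE_HTML)
print("post-content" in found)  # False for this sample
print(sorted(found))
```

Running the same check on `response.text` from the real page would show which class, if any, wraps the post body.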

https://www.marsja.se/converting-html-to-a-jupyter-notebook/
https://github.com/marsja/jupyter/blob/master/convert_html_jupyter_notebook_tutorial.ipynb

The article needs to be updated so that the scraper actually works.

Here's where the article references the tag to parse.

5. Getting the Code Elements from the HTML

In the last step, we are creating a Python function called get_code. This function will take two arguments: first, the BeautifulSoup object we created earlier, and second, the content_class to search for content in. In the case of this particular WordPress blog, this will be "post-content".
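For reference, a guarded version of such a function might look like the sketch below. It is not the article's code: the `div` container, the `pre` elements, and the `ValueError` are assumptions made here for illustration. It relies only on BeautifulSoup's `find`, `find_all`, and `get_text` methods, so a missing class fails loudly instead of silently yielding None.

```python
def get_code(soup, content_class):
    """Return the text of each <pre> block inside the content container.

    `soup` is expected to be a BeautifulSoup object. The "div" and "pre"
    tag names are assumptions for this sketch, not taken from the article.
    """
    container = soup.find("div", class_=content_class)
    if container is None:
        # find() returns None when nothing matches; report it explicitly.
        raise ValueError(f"no <div> with class {content_class!r} in the page")
    return [pre.get_text() for pre in container.find_all("pre")]
```

With BeautifulSoup installed, it could be called as `get_code(BeautifulSoup(html, "html.parser"), "post-content")`; on the current page that call would raise the ValueError rather than return nothing.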


@celiomarcos

Hi, I think this will be fixed by the pull request I have started.
