New html parser #888
Comments
IMO the HTML parser shouldn't be the performance bottleneck, unless you can provide some proof.
At present, PDM is reusing the ability of |
Do you think PDM will drop pip before pip fully removes html5lib from its vendored packages?
https://peps.python.org/pep-0691/ Maybe we don't need a new HTML parser.
But the existing API would not be deprecated soon.
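For reference, the JSON form of the simple API described in PEP 691 can be read without any HTML parser at all. The following is only a sketch: the function name `files_from_json_index` and the sample payload are hypothetical, the URLs are placeholders, and the key names (`meta`, `name`, `files`, `filename`, `url`, `hashes`) are the ones PEP 691 defines for a project page.

```python
import json


def files_from_json_index(payload):
    """Return (filename, url) pairs from a PEP 691 style project page.

    `payload` is the JSON text of the response body. Only the keys
    defined by PEP 691 are read; anything else is ignored.
    """
    data = json.loads(payload)
    return [(entry["filename"], entry["url"]) for entry in data.get("files", [])]


# Hypothetical sample payload, shaped like a PEP 691 project page.
SAMPLE_RESPONSE = json.dumps(
    {
        "meta": {"api-version": "1.0"},
        "name": "demo",
        "files": [
            {
                "filename": "demo-1.0.tar.gz",
                "url": "https://example.invalid/demo-1.0.tar.gz",
                "hashes": {"sha256": "abc"},
            },
            {
                "filename": "demo-1.0-py3-none-any.whl",
                "url": "https://example.invalid/demo-1.0-py3-none-any.whl",
                "hashes": {"sha256": "def"},
            },
        ],
    }
)

if __name__ == "__main__":
    for filename, url in files_from_json_index(SAMPLE_RESPONSE):
        print(filename, url)
```

Indexes that only serve the HTML form would still need an HTML parser, which is the point made in the reply above.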
On PDM 2.0 we switched from pip to unearth, which uses |
Please test it on |
Is your feature request related to a problem? Please describe.
As I mentioned earlier, html5lib will be removed from pip (pip already no longer uses html5lib by default). So maybe rewrite this to use html.parser, or add a third-party parser library such as html5lib. Alternatively, use a faster parser such as selectolax to improve performance.
Describe the solution you'd like
I think html.parser is slow, as is html5lib, so adding selectolax could be a solution.
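To put a number on the "slow" claim, one option is to time the standard-library parser on a synthetic index page. This is a rough sketch, not PDM's or pip's actual code: `AnchorCollector`, `extract_links`, and `SAMPLE_PAGE` are hypothetical names, and the page is a made-up list of 1000 anchors in the style of a simple index page.

```python
import timeit
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; attrs is a list of (name, value).
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(page):
    """Parse an HTML string and return the hrefs of its anchors in order."""
    parser = AnchorCollector()
    parser.feed(page)
    parser.close()
    return parser.links


# Synthetic page (an assumption, not real index data).
SAMPLE_PAGE = (
    "<!DOCTYPE html><html><body>"
    + "".join(
        f'<a href="https://example.invalid/pkg-{i}.0.tar.gz#sha256=abc">'
        f"pkg-{i}.0.tar.gz</a><br/>"
        for i in range(1000)
    )
    + "</body></html>"
)

if __name__ == "__main__":
    runs = 20
    seconds = timeit.timeit(lambda: extract_links(SAMPLE_PAGE), number=runs)
    print(f"html.parser: {seconds / runs * 1000:.2f} ms per page")
```

Running the same harness against another parser on the same input would give the kind of comparison the maintainers asked for. Real index pages and network time may change the picture, so the result should be read as indicative only.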