Allow umlaut domains for website addresses #952
Thank you for this PR. In my eyes, we should avoid introducing new dependencies. The relevant part of the
Please also add a test for the use case you're trying to address
isso/views/comments.py
Outdated
return __url_re.match(text) is not None
text = normalize(text)
# urlparse does not like port numbers in URLs
text = re.sub(r':\d+', '', text)
This comment seems odd to me - urlparse handles port numbers in URLs fine, so there must be something else going on?
My bad, the reason for removing the port was not urlparse; it is validators.domain(), which does not accept domain:port. I guess I could clean this up by using hostname instead of netloc, but if the complete URL is supposed to be validated, these lines are most probably not staying anyway.
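The difference between netloc and hostname mentioned above can be seen with the standard library alone. This is a minimal sketch, not the code from this PR: `host_for_validation` is a hypothetical helper name, and the example URL is made up for illustration.

```python
from urllib.parse import urlsplit


def host_for_validation(url: str) -> str:
    """Return only the host of a URL, without port or credentials.

    Hypothetical helper for illustration; not part of isso.
    """
    parts = urlsplit(url)
    # netloc keeps the port ("example.org:8080"), hostname drops it.
    return parts.hostname or ""


parts = urlsplit("http://example.org:8080/blog")
print(parts.netloc)    # host together with the port
print(parts.hostname)  # host only
print(parts.port)      # port as an integer
```

Passing the hostname rather than the netloc to a domain validator would make the regex that strips `:\d+` unnecessary.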
I did add my test case in test_comments.py.
Do we actually need to check the domain name at all?
Because IIRC the website is inserted as a link (if given), we should make sure it is valid. If we skipped the validity check, I'm not sure that the markup escaping would catch e.g. someone entering malicious JavaScript
That seems like an argument for just fixing the markup escaping to me...
So, the goal is to actually check the complete entered URL, not only if the domain is valid, as I assumed?
Sorry for the delay; let's just merge this, since it's clearly an improvement over the current situation. Ideally, we'd be moving away from checking for valid domains to checking for malicious things though.
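One way to read "checking for malicious things" is to allow only known-safe URL schemes and to escape the value before it is placed in markup. The following is a rough sketch under that assumption; `ALLOWED_SCHEMES`, `is_safe_website` and `render_website_link` are hypothetical names, not isso's API, and the allow-list itself is an assumption.

```python
from html import escape
from urllib.parse import urlsplit

# Assumption: only plain web links are acceptable as a commenter's website.
ALLOWED_SCHEMES = {"http", "https"}


def is_safe_website(url: str) -> bool:
    """Accept a URL only if it uses an allowed scheme and names a host."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit raises on some malformed input, e.g. broken IPv6 brackets.
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)


def render_website_link(url: str) -> str:
    """Escape the URL before embedding it in an anchor element."""
    safe = escape(url, quote=True)
    return f'<a href="{safe}" rel="nofollow noopener">{safe}</a>'


print(is_safe_website("https://example.org"))
print(is_safe_website("javascript:alert(1)"))
```

A scheme check of this kind rejects `javascript:` links regardless of which characters the domain contains, so it does not stand in the way of umlaut domains.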
Checklist
CHANGES.rst because this is a user-facing change or an important bugfix
What changes does this Pull Request introduce?
Changed website validation to allow domain names containing umlauts
Why is this necessary?
Resolves issue #951
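For context on why umlaut domains trip ASCII-only checks: an internationalized hostname can be converted to its ASCII form before validation. The sketch below uses Python's built-in "idna" codec (IDNA 2003) and is only an illustration; it is not necessarily how this PR implements the change, and `to_ascii_hostname` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def to_ascii_hostname(url: str) -> str:
    """Return the URL's hostname in its ASCII (punycode) form.

    Hypothetical helper; relies on the standard library "idna" codec.
    """
    host = urlsplit(url).hostname or ""
    # Labels containing non-ASCII characters become "xn--..." labels.
    return host.encode("idna").decode("ascii")


ascii_host = to_ascii_hostname("http://müller.de/")
print(ascii_host)
# Decoding with the same codec restores the original spelling.
print(ascii_host.encode("ascii").decode("idna"))
```

An ASCII-only domain pattern can then be applied to the converted hostname instead of the raw user input.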