-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Add RFC 3986 and WHATWG compliant URL parsing support #14461
base: master
Are you sure you want to change the base?
Conversation
@@ -3715,7 +3715,6 @@ function uniqid(string $prefix = "", bool $more_entropy = false): string {} | |||
|
|||
/** | |||
* @return int|string|array<string, int|string>|null|false | |||
* @compile-time-eval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed for benchmarking purposes
ext/url/php_url.c
Outdated
|
||
static void cleanup_parser(void) | ||
{ | ||
if (++URL_G(urls) % 500 == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach is copy-pasted from lexbor/lexbor#206
823e4c6
to
e65deb7
Compare
0ccffa0
to
04d4e6e
Compare
I need to pull in a new version of Lexbor as well in 8.4 for the GB18030 changes so I'll handle the encoding changes. |
@nielsdos Ah thanks for the link! I totally missed the readme :/ TBH I silenced the compiler errors with I was about to report your findings back to my Lexbor issue, but I've just seen that you beat me to do it :) |
I'll make a PR to update Lexbor now, then once that's merged you can rebase on master and get rid of the warning silencer. |
acb1a98
to
5a25f1a
Compare
13e8566
to
8e07430
Compare
d894dad
to
8e21e67
Compare
ext/uri/php_uri.c
Outdated
RETURN_THROWS(); | ||
} | ||
|
||
zend_result result = this_internal_uri->handler->normalize_uri(this_uri); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nielsdos I would like to ask your opinion about a problem I recently realized.
Context: I added a normalize_uri
handler that is able to normalize the URI. It is used in multiple places (e.g. normalize()
, toNormalizedString
, equalsTo()
methods). However, this would possibly cause some unexpected behavior when a built-in URI is extended in userland: if a child class overrides the normalize()
method, they would legitimately assume that doing so will affect the rest of the relevant methods (toNormalizedString()
, equalsTo()
), but no, this won't be the case, since the internal handler is reused in fact... which PHP programmers have no control over. I can see two solutions so far:
- make the built-in URI implementations
final
: there's no more problem with method overrides, but using composition instead of inheritance requires more work (implementing all the methods of theUriInterface
), you can't just substitute an internal URI class with your own implementation etc. - the
normalize()
method itself should be called instead of thenormalize_uri
internal handler: this solves the issue, but it results in quite some performance overhead
Do you have any preference/insights about this question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kocsismate Interesting.
Here's an idea:
We can know upfront if the "normalize" method is overridden when we create the internal uri object. We can do this by checking whether the entry for "normalize" in the function table has type ZEND_USER_FUNCTION
instead of ZEND_INTERNAL_FUNCTION
. In that case, we know we should call the user handler, otherwise we should call the internal handler.
Create a helper function invoke_normalize_uri
(name is just illustrative) that you use to invoke either the internal normalize handler or the user function. Then you can use invoke_normalize_uri
everywhere without worrying. To make the invocation fast you can have a zend_function*
field in the internal object that is NULL
when the user did not override it, and points to the user function if the user did override it. Then you can use the Zend APIs to call that user function if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nielsdos Thanks for the brilliant idea! :) That's really impressive. I'll have to think about in which cases a similar issue can happen, but if it's not too frequent, then your solution will perfectly mitigate it. I wouldn't like though if i.e. all the property reader functions would have to use this workaround, but hopefully it won't be the case.
18e80e0
to
d0b9a3d
Compare
2nd take after the failed experiment with #11315
RFC: https://wiki.php.net/rfc/url_parsing_api