Nokogiri::XML::Node#add_child_node_and_reparent_attrs behaves incorrectly if an attribute name has a colon #1790
Comments
Thank you for submitting this! I'm just back from vacation and it will take a few days to catch up on everything. Thanks for your patience.
In retrospect, I'm not sure what this function is supposed to do at all. If I'm reading
Hey, @stevecheckoway! Finally getting around to looking at this.
Yes, you're right. This method originally came from #870 which was a fix for #869; but the actual root cause was that
I just wrote #2310 which fixes the underlying implementation of
Side note: this refactor became much easier after #1712 was fixed by #2246 and #2228 was fixed by #2230, because the reparenting/namespacing behavior was much less buggy.
Again, this method came from #870 which was a fix for #869. Specifically, it's intended to handle cases like this:

Nokogiri::XML::Builder.new do |xml|
  xml.root('xmlns:foo' => 'bar') {
    xml.obj('foo:attr' => 'baz')
  }
end

We're trying to resolve this ambiguity:
If we were re-designing the Builder API, we could probably allow for explicitly setting the namespace for that attribute. However, there's an analogous use case that doesn't use Builder:

doc1 = Nokogiri::XML('<root/>')
doc2 = Nokogiri::XML('<root xmlns:foo="http://foo.io"/>')
node = doc1.create_element('obj', 'foo:attr' => 'baz')
# at this point, the attribute is named `foo:attr`
# #<Nokogiri::XML::Element:0xba4 name="obj" attributes=[#<Nokogiri::XML::Attr:0xb90 name="foo:attr" value="baz">]>
doc2.root.add_child(node)
# but now the attribute is `attr` with a namespace
# #<Nokogiri::XML::Element:0xba4 name="obj" attributes=[#<Nokogiri::XML::Attr:0xbcc name="attr" namespace=#<Nokogiri::XML::Namespace:0xbb8 prefix="foo" href="http://foo.io"> value="baz">]>

Your suggested fix breaks this "namespacing" process for the attribute in question. Normally this is not an issue if the implied prefix doesn't correspond to a namespace in the document -- for example, using
but unfortunately in an XML document the

@stevecheckoway WDYT of an alternative approach: what if we skipped
See #2310 for what I think is a good fix for all of this in fewer lines of code.
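The distinction this thread keeps returning to, between an attribute name that merely contains a colon and one whose prefix matches a namespace declared on the new parent, can be sketched in plain Ruby. This is an illustration only and is not Nokogiri's implementation or the change in #2310: `resolve_attr` and `NAMESPACES_IN_SCOPE` are hypothetical names, the namespace map is assumed to mirror the `doc2` example above, and the sketch does not model the HTML-document case raised in the report.

```ruby
# Illustrative sketch only: plain Ruby, no Nokogiri calls.
# `resolve_attr` and `NAMESPACES_IN_SCOPE` are hypothetical names.

# Prefix => href declarations assumed to be in scope on the new parent,
# mirroring <root xmlns:foo="http://foo.io"/> from the example above.
NAMESPACES_IN_SCOPE = { "foo" => "http://foo.io" }.freeze

# Returns [attribute_name, namespace_href_or_nil].
def resolve_attr(name, namespaces)
  prefix, local = name.split(":", 2)

  # No colon, or an empty prefix/local part: keep the literal name.
  return [name, nil] if local.nil? || prefix.empty? || local.empty?

  href = namespaces[prefix]

  # Undeclared prefix: keep the literal name, colon included.
  return [name, nil] unless href

  # Declared prefix: the attribute becomes `local` in that namespace.
  [local, href]
end

%w[foo:attr bar:attr class].each do |name|
  # The bare colon test cannot tell foo:attr and bar:attr apart.
  colon_test = name.match?(/:/)
  p [name, colon_test, resolve_attr(name, NAMESPACES_IN_SCOPE)]
end
```

Under these assumptions the bare colon test is true for both `foo:attr` and `bar:attr`, while only `foo:attr` resolves to a namespaced attribute; `bar:attr` keeps its literal name.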
What problems are you experiencing?
Nokogiri::XML::Node#add_child_node_and_reparent_attrs uses an incorrect test to see if a node's namespaces need to be reparented (at least, I think that's the purpose of this code).
It's testing a.name =~ /:/ to decide to reparent. This breaks HTML elements which are allowed to have colons in their attribute names. (Essentially, HTML only allows foreign elements to have explicit namespaces.)
It's a little convoluted to demonstrate the problem using the Nokogiri API.
This prints
I note that using doc.root = html doesn't do this reparenting.
Here's a backwards compatible fix.
This prints
Ideally, an API for manipulating attributes in a given namespace that doesn't do the parsing based on colon (e.g., uses xmlNewProp rather than xmlSetProp) would be fantastic.
What's the output from nokogiri -v?
Can you provide a self-contained script that reproduces what you're seeing?