Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deepcopy: recurse through immutable objects with mutable fields. #8560

Merged
merged 3 commits into from
Oct 3, 2014

Conversation

StefanKarpinski
Copy link
Member

I've encountered a number of cases which have caused me to believe that recursing through immutable types with mutable fields is the right thing to do. One example (currently broken on master), is OrderedDict [1] from DataStructures – this type is a lightweight, immutable wrapper around HashDict [2], which is mutable. The current deepcopy behavior means that doing od2 = deepcopy(od1) simply returns od1, which is clearly not what is wanted. This could be fixed in the DataStructures package, but I suspect wanting to copy through immutables is the rule, not the exception.

Philosophically, the only coherent way to define at a conceptual level what deepcopy should do that I've come up with is that it should be functionally equivalent to serializing and then deserializing an object. That would recursively copy through immutable types with mutable fields, which is another justification for this change.

I believe that not sticking immutable objects in stackdict is correct, but I'm not 100% sure. My reasoning is that in order for a path from the root of a data structure to have a loop, it has to have passed through a mutable object since otherwise you couldn't have constructed the structure in the first place (unless you cheated, in which case you have only yourself to blame), at which point the deepcopy recursion will stop once that mutable object is reached a second time. There may be some unnecessary copying of immutable objects, but I think it's harmless.

[1] https://github.com/JuliaLang/DataStructures.jl/blob/master/src/ordereddict.jl
[2] https://github.com/JuliaLang/DataStructures.jl/blob/master/src/hashdict.jl

@garborg
Copy link
Contributor

garborg commented Oct 2, 2014

This bit me early on, but I didn't feel knowledgable enough to comment at the time. The proposed behavior makes more sense to me.

@kmsquire
Copy link
Member

kmsquire commented Oct 2, 2014

Makes sense to me. +1

@JeffBezanson
Copy link
Member

Goodness, I'm not sure I realized we were doing this currently. This change is definitely better. It would probably be a worthwhile optimization to avoid copying immutable values that are also isbits.

@StefanKarpinski
Copy link
Member Author

So just basically just change the first line to isbits(T) | isempty(T.names) && return x?

@JeffBezanson
Copy link
Member

Yes.

@StefanKarpinski
Copy link
Member Author

I guess everyone is cool with this then. Will add a NEWS entry and merge later unless I hear otherwise.

StefanKarpinski added a commit that referenced this pull request Oct 3, 2014
deepcopy: recurse through immutable objects with mutable fields.
@StefanKarpinski StefanKarpinski merged commit 60a9f0a into master Oct 3, 2014
@kmsquire
Copy link
Member

kmsquire commented Mar 3, 2015

This should be okay to backport, right? JuliaCollections/DataStructures.jl#75 is exactly the issue @StefanKarpinski mentions in the description above.

@tkelman
Copy link
Contributor

tkelman commented Mar 3, 2015

I guess. Could this conceivably break anything? Or conversely, how likely would it be that people could begin relying on changed behavior for 0.3.7-or-later without realizing it or being careful about package REQUIREs?

@kmsquire
Copy link
Member

kmsquire commented Mar 3, 2015

The previous behavior is pretty sketchy, so I think it's unlikely that someone would be relying on it. I also think that this is a bug fix, and that everything up through 0.3.6 is broken.

Or conversely, how likely would it be that people could begin relying on changed behavior for 0.3.7-or-later without realizing it or being careful about package REQUIREs?

I guess that someone could have redefined deepcopy to work around this behavior in some package, notice it's been fixed, and remove that workaround, breaking the packages on v0.3.6 or earlier. Is the better solution then, to make sure that every package who needs the correct behavior implement a workaround for deepcopy? The only packages that I have installed that redefine deepcopy are DataArrays and DataFrames, but I don't have many packages installed right now.

I can understand either way--I just find the latter idea annoying.

@tkelman
Copy link
Contributor

tkelman commented Mar 3, 2015

If a few packages are already defining workarounds, then they should either adjust their REQUIRE if they remove the workaround, or do a Compat-style check against VERSION to conditionally apply the workaround after this gets backported. As long as the relevant people are paying attention and careful about doing that, then good enough for me.

StefanKarpinski added a commit that referenced this pull request Mar 11, 2015
(cherry picked from commit 5807658)
ref PR #8560

deepcopy optimization: return bits types immediately.

(cherry picked from commit 8b0d0f6)

deepcopy: update NEWS.md and docs.

(cherry picked from commit 4837f35)

Conflicts:
	NEWS.md
@tkelman
Copy link
Contributor

tkelman commented Mar 11, 2015

squashed and backported in b55539a

@tkelman tkelman deleted the sk/deepcopyr branch April 19, 2015 11:43
@PallHaraldsson
Copy link
Contributor

"added the backport pending label on Mar 3"

I was looking at NEWS (seeing what might be new in 0.3.8 and found nothing):

"squashed and backported in b55539a"

Was this supposed to say 0.3.8 not 0.3.7 in NEWS?

I assume you always backport just one version back (e.g. never by now 0.2, but also for the forseeable future only one back) and to the newest minor version, right? I'm not even sure you could (or would want to) backport a version (0.3.7) that is already out.

@tkelman
Copy link
Contributor

tkelman commented May 4, 2015

That backport was in March, prior to the release of 0.3.7. I only modify NEWS.md on release-0.3 if the commit I'm backporting touches that file. It's very rare to backport commits that are significant enough to warrant mentioning in NEWS.md.

@StefanKarpinski
Copy link
Member Author

Indeed, any commit that touches NEWS almost by definition doesn't belong in a bug-fix release.

@PallHaraldsson
Copy link
Contributor

Ok. It just felt March 3 (Mar 3 looks like maí=May in my language..) wasn't that long ago.. but it's pre-0.3.7 then.. Still, assume (non-NEWS-worthy) backports only done as I described.

@tkelman
Copy link
Contributor

tkelman commented May 4, 2015

We might conceivably do a few more backports to release-0.3 even after 0.4.0 gets released, but only if it's fixing some especially important bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants