Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbols can't be used as WeakMap keys #1194

Closed
devsnek opened this issue May 15, 2018 · 102 comments · Fixed by #2777
Closed

Symbols can't be used as WeakMap keys #1194

devsnek opened this issue May 15, 2018 · 102 comments · Fixed by #2777

Comments

@devsnek
Copy link
Member

devsnek commented May 15, 2018

Symbols cannot be used as WeakMap keys as they are primitives. However unlike other primitives they are unforgable and unique, so they should be completely safe for use as WeakMap keys.

@ljharb
Copy link
Member

ljharb commented May 15, 2018

Global symbols (Symbol.for, Symbol.keyFor) are not unique, and as primitives, can't ever be collected - I assume that since these kinds of symbols can't be WeakMap keys, and because it would be confusing for some symbols to be usable as WeakMap keys but not others, that it makes the most sense to disallow all symbols as WeakMap keys?

@erights
Copy link

erights commented May 15, 2018 via email

@devsnek
Copy link
Member Author

devsnek commented May 15, 2018

i would lean towards allowing global symbols since they aren't the only things that could prevent collection. any value used as a weakmap key can be attached places such that it won't be collected. beyond that, Symbol() symbols should definitely be allowed.

@tabatkins
Copy link

I don't see a significant difference between a registry-Symbol and an object that's stored on the global. Both will never be collected, and thus will prevent a weak value from being collected.

But if we really do think this is a footgun, then we can just disallow registry-Symbols from being used (those for whom Symbol.keyFor returns a non-undefined value). The registry can't be manipulated by the user; Symbols are inserted into it by the UA immediately upon creation and never removed, so "in the registry" is effectively a constant quality of the Symbol itself, and can be relied on.

It just seems silly that {} can be a key but Symbol() can't, given that they can serve similar purposes.

@ljharb
Copy link
Member

ljharb commented May 15, 2018

You can always prevent collection; the difference is that with objects you can always allow collection by dropping all refs to the object - or by having the realm itself collected.

With global symbols - which are cross-realm - it would prevent ever collecting it.

@bmeck
Copy link
Member

bmeck commented May 16, 2018

@ljharb If a WeakMap is reaped it does not prevent collecting values with eternal keys since that weak reference to the value is removed, even if the key itself is strongly held still.

@bmeck
Copy link
Member

bmeck commented May 16, 2018

@erights On a tagent that might come back to this, is there any reason having keys mismatch between Weak collections and WeakRef might be problematic? I know for WeakRef it would never fire a finalizer for a Symbol from the SymbolRegistry if it were allowed as a key (but it would strongly keep the finalizer/holdings alive I think?).

@gibson042
Copy link
Contributor

With global symbols - which are cross-realm - it would prevent ever collecting it.

Isn't the same true of the outermost global object?

@ljharb
Copy link
Member

ljharb commented May 16, 2018

@gibson042 if the realm is collected, the global object for it could also be (assuming it was a key in a WeakMap from a different realm)

@gibson042
Copy link
Contributor

Right, but I was referring to the global object of the outermost realm (though I suppose such a statement reads as vacuously true). Still, the set of well-known Symbols shared across realms is necessarily bounded and small, and their uncollectability shouldn't be a concern. Or am I missing something?

@bmeck
Copy link
Member

bmeck commented May 17, 2018

I'm not sure this talk about keys really matters if the WeakMap being reaped allows what it is holding be reaped? Even if I create a map using:

let map = new WeakMap();
map.set(Symbol.iterator, BIG_OBJECT); // Symbol.iterator is shared between all realms as well
map = null;

BIG_OBJECT can still be collected since the map can be collected.

@loganfsmyth
Copy link
Contributor

One thing that comes to mind here is that doesn't seem to be polyfillable without leaking. Are there any concerns around that?

@bmeck
Copy link
Member

bmeck commented May 24, 2018

@loganfsmyth isn't that a concern with any Weak collection polyfill?

@devsnek
Copy link
Member Author

devsnek commented May 24, 2018

@bmeck traditionally the polyfills assign a "hidden" property on the key that holds the value. that can't be done with a Symbol.

@benjamn
Copy link
Member

benjamn commented May 24, 2018

In a recent project, I resorted to implementing a class called UniversalWeakMap, which maintains a WeakMap and a Map instance property, called this._weakMap and this._strongMap, and simply stores any non-reference keys (which can't be stored in this._weakMap) in the this._strongMap instead, so that it can expose a uniform get/set interface that works for any kind of key. Of course the non-reference keys are strongly held (including Symbols, unfortunately), but the whole UniversalWeakMap object could potentially be garbage collected.

Note: for this particular project, I didn't need to match the shared WeakMap/Map interface exactly, which is why my UniversalWeakMap doesn't have methods like has and delete, though they would be easy to implement.

I mention this example as evidence that it would be genuinely useful to have fewer restrictions on what you can put in a WeakMap, as long as you don't mind the drawbacks (in particular, you don't care about iterability).

To the specific question of whether Symbol keys should be allowed in WeakMaps, that would certainly be more convenient than the status quo. However, I would go a step further, and recommend that any kind of key should be allowed in a WeakMap. None of the keys (weak or strong) should be iterable, obviously. Those keys that can be garbage collected should disappear from the WeakMap whenever they become unreachable, and those keys that can't be collected (either because they're a kind of value that can never be collected, or they just never happen to become unreachable) should simply remain in the WeakMap until they are explicitly removed by the program.

In either case, a native WeakMap implementation will not keep any keys from being collected that could otherwise be collected, so I don't see any potential for confusion or memory leaks—in a native implementation, at least, which is why this would need to be standardized rather than just polyfilled.

@devsnek
Copy link
Member Author

devsnek commented May 24, 2018

forgable values can't be done because we want to prevent observing gc

@benjamn
Copy link
Member

benjamn commented May 24, 2018

@devsnek Forgeable Symbol values by definition can't be collected, since they have to be === if you forge them again later, so there's nothing to observe?

@ljharb
Copy link
Member

ljharb commented May 24, 2018

Observing GC isn’t a concern; WeakRefs are coming, either via JS or via WASM, so it will be observable.

@benjamn
Copy link
Member

benjamn commented May 24, 2018

Especially once we have WeakRefs and thus observable GC (but even right now), I honestly don't see the benefit (for developer expectations or memory usage or any other reason) for refusing to store non-reference keys in a WeakMap.

@erights
Copy link

erights commented May 25, 2018 via email

@ljharb
Copy link
Member

ljharb commented May 25, 2018

@erights thanks for clarifying.

In that case, can anyone elaborate on why WeakMaps can't accept all values as keys? (@erights, @allenwb?)

@bakkot
Copy link
Contributor

bakkot commented May 25, 2018

Personally, I expect that if I put a thing in a WeakMap then the corresponding value can someday be GC'd. For example, I have tons of code which does stuff like

const cache = new WeakMap;
function process(data) {
  if (cache.has(data)) {
    return cache.get(data);
  }
  const res = expensiveAlgorithm(data);
  cache.set(data, res);
  return res;
}

with the expectation that this caching is acceptable because the caller can always choose to drop data if they want res to stop taking up memory.

I really don't think it's a good idea to mix strong and weak holding in the same data structure. Yes, I know that sometimes the global object ends up being effectively strongly held, but that's an extremely minor and extremely edge-y edge-case.

@woess
Copy link
Contributor

woess commented May 26, 2018

Sorry to sidetrack, but what do you mean by (un)forgeable values?

@devsnek
Copy link
Member Author

devsnek commented May 26, 2018

@woess forgability refers to the ability to create a value that has the identity of another value without having access to the original value. two examples of this are numbers and strings. i can create 1 and have it be identical to this other 1 over here, without having access to the original 1. as with strings i can create "hello" in one place and "hello" in another place and they are identical.

on the other hand we have things like objects and symbols which are unforgable. if i have Symbol() in some place and Symbol() in another place they will not be identical. The only way to have identity is to grab that original symbol. The same goes with objects ({} === {} is false)

@erights
Copy link

erights commented May 26, 2018

@devsnek Good explanation of unforgeable. However, only unnamed symbols --- those created by the Symbol() expression you showed --- are unforgeable. Named symbols --- those created by Symbol.for(str) --- are not unforgeable, creating the dilemma at issue in this thread.

@bmeck
Copy link
Member

bmeck commented May 26, 2018

@erights I don't think that is true, the GlobalSymbolRegistry is returning the exact same Symbol rather than creating new primitive values in https://tc39.github.io/ecma262/#sec-symbol.for . Those things are reachable but not forgeable. I cannot recreate them without access to that registry.

@erights
Copy link

erights commented May 26, 2018

Access to the registry is not deniable. Given that everyone has implicit access to the registry, they can obtain access to any named symbol given only knowledge of the string. IOW, access to a named symbol is "knowledge limited" rather than "access limited".

We were very careful to design the semantics of the registry so that it would not be a global communications channel. Given the way the semantics of the registry are stated, this safety property is hard to see. A better way to describe its semantics is that a named symbol is a value, without identity, that wraps a string. All the equality comparison operators, given two named symbols, judges them to be equal iff their wrapped strings are equal. This account is not observably different and need not hypothesize any registry or any other form of shared state. In this account, it is obvious there is no global communications channel.

@bmeck
Copy link
Member

bmeck commented May 26, 2018

@erights https://tc39.github.io/ecma262/#sec-samevaluenonnumber appears to compare that they are the same value, not based upon any internal string that I can tell. It could be implemented as potentially multiple values being checked by some internal string that appear to act as a single value, but is not what the spec appears to be saying. Do you think this a bug in the specification?

@littledan
Copy link
Member

@devsnek Yes, I understand that this is a disadvantage; how should we weigh the cost of that against the cost of not having some kind of capability in this area? (Making a separate, parallel type was another way of avoiding the mismatch.) It feels to me, overall, like this is a solvable problem, and we just have various tradeoffs that we can make about the solution.

@devsnek
Copy link
Member Author

devsnek commented Apr 27, 2020

@littledan i think its less about specific tradeoffs and more that we're in deadlock (some people don't want all symbols, some people don't want only specific kinds of symbols)

@littledan
Copy link
Member

Lots of times, you can get through deadlock by considering the whole space, the value of the proposal overall, and weighing it against the cost of various alternatives. We've gotten through various controversial/deadlocked things at TC39 before this way.

@littledan
Copy link
Member

Overall, I think Symbols as WeakMap keys would be a very useful base for being able reference objects from primitives (in the context of the Records and Tuples proposal), by indirecting through the WeakMap.

The implication chain goes like this:

  • A core benefit of Records and Tuples is that === does deep comparison
  • === is a reliable operation, so it's not something that Proxy or Symbols or operator overloading could trap
  • Records and Tuples need to be primitives, so that they can have === semantics which are not based on object identity
  • These primitives cannot be membrane-wrapped, so there's no way they could directly contain Objects, since then these objects would pierce through the membrane. Access to objects from Records and Tuples would need to be provided by some other object, access to which could be membrane-wrapped.
  • The references to Objects which are immediately referenced in Records and Tuples need to be through a primitive type, such as Symbol (or a new "box" type)
  • There would need to be some kind of mapping from some kind of primitive which has an identity (such as Symbol or "box") to objects, in order to support Records and Tuples logically referencing objects (without manually maintaining a parallel data structure)

Symbols as WeakMap keys seems simpler than adding a "box" type, even though we don't need to use these things as property keys.

Such a system would preserve our typical invariants protecting membranes/ocap systems, since object operations are used to access the WeakMap. While it would be ergonomically nice and composition-promoting to have a single built-in mapping from Symbols (or some other primitive) to objects, this wouldn't meet ocap goals, since it would constitute built-in shared state, providing a cross-compartment communication channel for multiple compartments sharing the same frozen Realm containing all of TC39's unmodified primordials.

The open questions I see are:

  • Can registered symbols be used as WeakMap keys? @ljharb has expressed that they should be (if we support this at all), and @erights has expressed that they shouldn't be. Personally, I think either option would be an acceptable choice:
    • Allowing registered symbols doesn't seem so bad, since registered Symbols are similar to Objects that are held alive for the lifetime of the Realm. Things like Symbol.iterator are similar to primordials like Object.prototype and Array.prototype, and registered Symbol.for() symbols are similar to properties of the global object. Just because these will stay alive doesn't mean we disallow them as WeakMap keys.
    • Prohibiting registered symbols doesn't seem so bad, since it's already readily observable whether a Symbol is registered, and it's not very useful to include these as WeakMap keys. Therefore, it's hard to see what practical or consistency problems the prohibition would create.

I hope we can work out a solution here among the available options.

  • We could support Symbols in WeakRefs and FinalizationRegistry, or not. I don't have any use case in mind, but it would seem consistent with adding them as WeakMap keys. I imagine we should support Symbols as WeakSet entries as well (I don't think this should be controversial if we agree on Symbols as WeakMap keys).

I plan to propose "Symbols as WeakMap keys" for Stage 1 at the June 2020 TC39 meeting. By going through the stage process, I hope we can develop consensus on answers to these questions little by little.

littledan added a commit to littledan/ecma262 that referenced this issue Jun 9, 2020
Specification PR for https://github.com/tc39/proposal-symbols-as-weakmap-keys
Note that both registered and unregistered Symbols are permitted here.

Closes tc39#1194
littledan added a commit to littledan/ecma262 that referenced this issue Jul 19, 2020
Specification PR for https://github.com/tc39/proposal-symbols-as-weakmap-keys
Note that both registered and unregistered Symbols are permitted here.

Closes tc39#1194
leobalter pushed a commit to leobalter/ecma262 that referenced this issue Apr 8, 2021
Specification PR for https://github.com/tc39/proposal-symbols-as-weakmap-keys
Note that both registered and unregistered Symbols are permitted here.

Closes tc39#1194
leobalter pushed a commit to leobalter/ecma262 that referenced this issue May 25, 2021
Specification PR for https://github.com/tc39/proposal-symbols-as-weakmap-keys
Note that both registered and unregistered Symbols are permitted here.

Closes tc39#1194
@trusktr
Copy link

trusktr commented Jul 2, 2021

It would be nice if Symbol.for, which is currently similar to a Map<string, Symbol>, was actually some a special API similar to a WeakMap<string, Symbol> that holds any symbols it creates weakly (that's not currently possible with WeakMap, and this wouldn't actually be a WeakMap, just something similar).

Then, if a symbol returned by Symbol.for is no longer referenced by user code while Symbol.for holds symbols weakly, the symbol could be collected, and hence using any symbols (even those made with Symbol.for) would allow WeakMap contents to always be collectable when keys are symbols of any kind that are no longer referenced.

This would cancel out the concern of foregeable vs unforgeable symbols, if I understand correctly.

I think this would be fine for Symbol.for symbols (even if they are cross-realm), because if they aren't referenced (and hence can be collected), then there is no way that code obtaining a symbol using the same name as before would ever know that the symbol is a new one and not one that existed before.


Breaking change:

This would break code that currently relies purely on Symbol.for for getting references, but I think that's a bad pattern anyway. Such code should (and can easily) be updated to store symbols in variables to make them explicitly referenced when needed.

Symbol.for is useful for cross-realm symbols, but it should not have had the added convenience of allowing people to skip variables.

Maybe solutions:

An idea could be to make Symbol.for(name) deprecated, and add a new API called Symbol.crossRealm(name) (or some name from the bike shed). Then remove Symbol.for from all docs, and maybe 5 or 10 years from now engines can remove Symbol.for.

Another idea is to at least introduce an API for cleanup like Symbol.remove(name) / Symbol.remove(Symbol.keyFor(sym))


As usual, I always proclaim that all APIs should be designed with the ability to clean up whatever they create. Symbol.for is currently not one of those APIs. Neither is import which causes permanent mem growth by not allowing modules to be collected, or by not having an API to explicitly destroy modules.

All APIs should be designed with the ability to clean up whatever they create.

That's a requirement for well-designed re-usable APIs.

@mhofman
Copy link
Member

mhofman commented Jul 14, 2021

It would be nice if Symbol.for, which is currently similar to a Map<string, Symbol>, was actually some a special API similar to a WeakMap<string, Symbol> that holds any symbols it creates weakly (that's not currently possible with WeakMap, and this wouldn't actually be a WeakMap, just something similar).

Then, if a symbol returned by Symbol.for is no longer referenced by user code while Symbol.for holds symbols weakly, the symbol could be collected, and hence using any symbols (even those made with Symbol.for) would allow WeakMap contents to always be collectable when keys are symbols of any kind that are no longer referenced.

This would cancel out the concern of foregeable vs unforgeable symbols, if I understand correctly.

I think this would be fine for Symbol.for symbols (even if they are cross-realm), because if they aren't referenced (and hence can be collected), then there is no way that code obtaining a symbol using the same name as before would ever know that the symbol is a new one and not one that existed before.

I believe current implementations already are a sort of WeakValueMap<string, Symbol> under the hood, with the Symbol collected once it is no longer observable. The concern implementors seem to have is that allowing the use of these registered Symbols as WeakMap keys or WeakRef/FinalizationRegistry target may allow their collection to be observed.

let wm = new WeakMap();
let s1 = Symbol.for('foo');
wm.set(s1, 8);
s1 = null;
// say a full gc happens here
let s2 = Symbol.for('foo');
console.log(wm.get(s2));

As I explained in tc39/proposal-symbols-as-weakmap-keys#19 (comment), the observation can only be made while the weak collections and weak refs stay reachable. Once these objects containing the registered symbol as key or target is unreachable, the implementation is free to unobservably collect the registered symbol.

This would break code that currently relies purely on Symbol.for for getting references

Could you clarify what you mean? If the user program is not holding the symbol itself, by definition it is not using it for anything, and cannot observe if the implementation is actually using another symbol instance. For that reason, the Symbol.for() API is not creating garbage any more than regular objects, which don't require explicit destructors either.

acutmore pushed a commit to acutmore/ecma262 that referenced this issue May 12, 2022
Specification PR for https://github.com/tc39/proposal-symbols-as-weakmap-keys
Note that both registered and unregistered Symbols are permitted here.

Closes tc39#1194
acutmore pushed a commit to acutmore/ecma262 that referenced this issue May 16, 2022
Specification PR for https://github.com/tc39/proposal-symbols-as-weakmap-keys
Note that both registered and unregistered Symbols are permitted here.

Closes tc39#1194
acutmore added a commit to acutmore/ecma262 that referenced this issue Jan 19, 2023
Specification PR for https://github.com/tc39/proposal-symbols-as-weakmap-keys
Note that both registered and unregistered Symbols are permitted here.
rename AO to CanBeHeldWeakly and include WeakRef and FinilisationRegistry
WIP: extend weakrefs to symbols
Update liveness section

Closes tc39#1194

Co-authored-by: Mathieu Hofman <[email protected]>
Co-authored-by: Richard Gibson <[email protected]>
Co-authored-by: Jordan Harband <[email protected]>
Co-authored-by: Daniel Ehrenberg <[email protected]>
Co-authored-by: Leo Balter <[email protected]>
Co-authored-by: Ashley Claymore <[email protected]>
acutmore added a commit to acutmore/ecma262 that referenced this issue Feb 6, 2023
- Proposal: https://github.com/tc39/proposal-symbols-as-weakmap-keys
- Also allows Symbols in WeakSet, WeakRef, and FinalizationRegistry
- Adds new AO 'CanBeHeldWeakly'
- Registered Symbols can not be held weakly

Closes tc39#1194

Co-authored-by: Daniel Ehrenberg <[email protected]>
Co-authored-by: Leo Balter <[email protected]>
Co-authored-by: Mathieu Hofman <[email protected]>
Co-authored-by: Richard Gibson <[email protected]>
Co-authored-by: Jordan Harband <[email protected]>
Co-authored-by: Shu-yu Guo <[email protected]>
Co-authored-by: Michael Dyck <[email protected]>
Co-authored-by: Michael Ficarra <[email protected]>
syg added a commit to syg/ecma262 that referenced this issue Mar 16, 2023
- Proposal: https://github.com/tc39/proposal-symbols-as-weakmap-keys
- Also allows Symbols in WeakSet, WeakRef, and FinalizationRegistry
- Adds new AO 'CanBeHeldWeakly'
- Registered Symbols can not be held weakly

Closes tc39#1194

Co-authored-by: Daniel Ehrenberg <[email protected]>
Co-authored-by: Leo Balter <[email protected]>
Co-authored-by: Mathieu Hofman <[email protected]>
Co-authored-by: Richard Gibson <[email protected]>
Co-authored-by: Jordan Harband <[email protected]>
Co-authored-by: Shu-yu Guo <[email protected]>
Co-authored-by: Michael Dyck <[email protected]>
Co-authored-by: Michael Ficarra <[email protected]>
@ljharb ljharb closed this as completed in 6e70bb6 Mar 18, 2023
@cpcallen
Copy link

cpcallen commented Mar 19, 2023

Excluding registered symbols as valid WeakMap keys compromises the feature so badly that I don't really consider this bug to actually be fixed since (e.g.) I can't write a library with an API that takes arbitrary symbols and uses a WeakMap to store associated information about them.

Yes, it's a bit better that at least some private data indexed by symbol can be garbage collected, but it's still necessary to maintain parallel Map and WeakMap structures for (AFAICT) no good reason other than to appease some hypothetical registry implementors and protect unwary developers from their own ignorance.

I'm really sorry that you didn't stick to your guns about treating all symbols the same, @ljharb.

@mhofman
Copy link
Member

mhofman commented Mar 19, 2023

I don't understand the hypothetical implementation comment. There are a multitude of implementations, some that can collect registered symbols, some that can't. However in either case, these symbols are specified to be forgeable, which means that the program should not be able to observe their collection.

That means that regardless of the implementation, if a registered symbol is used as a WeakMap key, it must in practice be held strongly and the entry cannot be collected (technically registered symbols can still be collected if their collection is not observed by a program making use of WeakRef and/or FinalizationRegistry, either directly, or indirectly through a chain of associated WeakMap values, but that would require an even more complex ephemeron-like logic).

Instead of forcing the implementation to pay the cost of this dual Map/WeakMap logic internally, the onus is on the program to handle registered symbols if it absolutely must treat all symbols equally.

I would be very curious to hear about the use case requiring storing blindly all symbols in a WeakMap, yet other forgeable values like string and numbers don't have the same requirement. Why can't your library require unique symbols?

@ljharb
Copy link
Member

ljharb commented Mar 19, 2023

@cpcallen registered symbols shouldn't ever have existed - they're just globals v2 - so i thought it was a viable compromise to allow the good kinds of symbols to be held weakly.

@hax
Copy link
Member

hax commented Mar 20, 2023

registered symbols shouldn't ever have existed

I have had a question for a long time --- what's the original motivation of registered symbols?

@syg
Copy link
Contributor

syg commented May 14, 2024

registered symbols shouldn't ever have existed - they're just globals v2

I think you mean strings instead of globals. Basically allowing registered symbols is the same as allowing strings, which remains a bad idea for the reasons @mhofman says.

@cpcallen If your issue is that you want to type typeof x === 'symbol', you may be interested in @ljharb's symbol predicates proposal for Symbol.isRegistered. (I'm still not convinced of the need for isWellKnown, but there's a compelling need for isRegistered as shown here.)

@ExE-Boss
Copy link
Contributor

ExE-Boss commented May 15, 2024

@syg

@cpcallen If your issue is that you want to type typeof x === 'symbol', you may be interested in @ljharb's symbol predicates proposal for Symbol.isRegistered. (I'm still not convinced of the need for isWellKnown, but there's a compelling need for isRegistered as shown here.)

You can already do:

/** @param {symbol} sym */
const isRegisteredSymbol = (sym) => {
	return Symbol.keyFor(sym) !== undefined;
}

@ljharb
Copy link
Member

ljharb commented May 15, 2024

@ExE-Boss close - you'd also need typeof sym === 'symbol' && after the return.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet