[Discussion] API to associate binding state (pointers) with the JerryScript context #1717
Comments
cc @jiangzidong @zherczeg ideas? |
What about adding two functions: one for setting a global user pointer and another for getting this pointer? |
That would mean you'd have to put all the pieces of state from various bindings into one big struct: not great for keeping the various bindings modular. I added a "design goal" to the list: enable multiple bindings to each associate their own state with a JerryScript context. |
About "scale down": |
It's hard to implement a general binding state struct/array that would let us reuse other bindings' code directly, because binding scenarios vary widely. |
As for me I would offer such things as an option. From JerryScript core side it would only be a pointer, but there would be utilities which offer more, and people could select what fits their use case. I described this concept in #1716. |
You have to also consider that there may be multiple mutually exclusive uses for adding a pointer to the context. For example, V8's isolate has […] So, if you're a binding and you retrieve the pointer from the context, is this the pointer you set, or is this the pointer some other binding set, because its init ran before your init?

Fortunately, we're dealing with a JavaScript engine here, right? So, we could create an object that is not exposed to JavaScript and attach it to the context, and then various bindings could claim various property keys on the object and attach their data as the value. So, like

#define MY_BINDING_DATA_KEY "my_binding_data_key"
...
my_binding_data *my_data = get_my_binding_data();
jerry_value_t context_object = jerry_get_context_object(context);
jerry_value_t my_property_value = jerry_create_object();
jerry_set_object_native_handle(my_property_value, (uintptr_t)my_data, free_my_binding_data);
jerry_value_t prop_name = jerry_create_string_from_utf8(MY_BINDING_DATA_KEY);
jerry_set_property(context_object, prop_name, my_property_value);
...

Basically, I wish to caution that a single pointer is insufficient because there may be multiple independent reasons for attaching data to a context. Thus, we should be exploiting the fact that JavaScript objects are string-keyed hashes. The example above could be made more efficient with a custom API, of course. After all, we do not need entire JS objects hanging from the one context object, but only native pointers. The important thing IMO is that the pointers are stored in a string-keyed hash, though. |
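To make the "only native pointers, keyed by string" idea concrete, here is a minimal sketch. It is not JerryScript API: every `sketch_` name is hypothetical, the context is a stand-in struct, and a small linear-search table is used in place of a real hash to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_SLOTS 8 /* hypothetical compile-time limit */

typedef void (*sketch_free_cb_t) (void *data);

typedef struct
{
  const char *key;          /* name claimed by a binding, e.g. its project name */
  void *data;               /* native pointer owned by that binding */
  sketch_free_cb_t free_cb; /* deleter, run when the context is discarded */
} sketch_slot_t;

/* Stand-in for per-context storage; this is not the real jerry_context_t. */
typedef struct
{
  sketch_slot_t slots[SKETCH_MAX_SLOTS];
  size_t count;
} sketch_keyed_context_t;

/* Look up the pointer stored under key, or NULL if no binding claimed it. */
static void *
sketch_context_get (const sketch_keyed_context_t *ctx, const char *key)
{
  assert (ctx != NULL && key != NULL);
  for (size_t i = 0; i < ctx->count; i++)
  {
    if (strcmp (ctx->slots[i].key, key) == 0)
    {
      return ctx->slots[i].data;
    }
  }
  return NULL;
}

/* Claim a key. Returns 0 on success, -1 if the key is taken or the table is full. */
static int
sketch_context_set (sketch_keyed_context_t *ctx, const char *key,
                    void *data, sketch_free_cb_t free_cb)
{
  assert (ctx != NULL && key != NULL);
  for (size_t i = 0; i < ctx->count; i++)
  {
    if (strcmp (ctx->slots[i].key, key) == 0)
    {
      return -1;
    }
  }
  if (ctx->count == SKETCH_MAX_SLOTS)
  {
    return -1;
  }
  ctx->slots[ctx->count++] = (sketch_slot_t) { key, data, free_cb };
  return 0;
}

/* Discard the context: each stored deleter runs once, which also tells the
 * binding that this context is gone. */
static void
sketch_context_discard (sketch_keyed_context_t *ctx)
{
  assert (ctx != NULL);
  for (size_t i = 0; i < ctx->count; i++)
  {
    if (ctx->slots[i].free_cb != NULL)
    {
      ctx->slots[i].free_cb (ctx->slots[i].data);
    }
  }
  ctx->count = 0;
}

/* Example deleter for illustration only: it records that it ran. */
static void
sketch_mark_released (void *data)
{
  *(int *) data = 1;
}
```

A binding would call `sketch_context_set` once from its init and `sketch_context_get` from its native callbacks. The trade-off discussed in the thread still applies: lookups cost a string comparison per slot, and two bindings must not pick the same key.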
For reference: nodejs/abi-stable-node#206 (comment) |
Basically, we ended up going with a per-module context, allocated when the module is first initialized, and then stored in the context for all native callbacks. This has the following downside: nodejs/node#12246 (review) |
Another nice benefit of implementing the JS external-data-plus-deleter approach is that, when a context is discarded, the deleters associated with the data will also be called, and so the bindings will be informed that a certain context has been discarded. This is also missing in V8. |
Hm, what about reversing the question? Instead of allowing pointers to be set in the JerryScript context, we could retrieve the starting address of the context, which should be unique. This could be a key to access various local properties. |
So have a context-keyed hash stored statically and globally in the binding, right? That's also possible, of course, as long as there is one well-known, well-documented key that can be used. Currently it sounds like the |
Then the binding won't know that a context has been discarded though and so it won't know that it can delete a key from the hash. There should be a callback to inform about that. |
Also it might be real nice if bindings didn't have to implement their own hash table, because five different bindings will implement it five different ways. |
It just ends up being boilerplate. |
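For illustration, the context-keyed table that each binding would have to carry could look roughly like the sketch below. All names are hypothetical, the context address is modelled as an opaque `const void *`, and the discard callback is the piece the comments above say is currently missing from the engine.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CONTEXTS 4 /* hypothetical limit chosen by the binding */

/* Whatever the binding needs to remember per engine context. */
typedef struct
{
  int open_handles;
} sketch_binding_state_t;

typedef struct
{
  const void *context_key; /* address identifying one engine context */
  sketch_binding_state_t state;
  int in_use;
} sketch_entry_t;

/* Static and global inside the binding, but keyed by context address. */
static sketch_entry_t sketch_table[SKETCH_MAX_CONTEXTS];

/* Find the state for a context, creating it on first use.
 * Returns NULL when the table is full. */
static sketch_binding_state_t *
sketch_state_for (const void *context_key)
{
  assert (context_key != NULL);
  sketch_entry_t *free_entry = NULL;
  for (size_t i = 0; i < SKETCH_MAX_CONTEXTS; i++)
  {
    if (sketch_table[i].in_use && sketch_table[i].context_key == context_key)
    {
      return &sketch_table[i].state;
    }
    if (!sketch_table[i].in_use && free_entry == NULL)
    {
      free_entry = &sketch_table[i];
    }
  }
  if (free_entry == NULL)
  {
    return NULL;
  }
  free_entry->context_key = context_key;
  free_entry->state.open_handles = 0;
  free_entry->in_use = 1;
  return &free_entry->state;
}

/* Must be invoked when a context is discarded. Without it the entry leaks,
 * and a later context reusing the same address would inherit stale state. */
static void
sketch_on_context_discarded (const void *context_key)
{
  assert (context_key != NULL);
  for (size_t i = 0; i < SKETCH_MAX_CONTEXTS; i++)
  {
    if (sketch_table[i].in_use && sketch_table[i].context_key == context_key)
    {
      sketch_table[i].in_use = 0;
    }
  }
}
```

This is the boilerplate in question: every binding that takes this route ends up with its own variant of the table, the lookup, and the cleanup hook.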
Yes. I do think the global context must be unique for each JerryScript instance, regardless of where it is stored. Otherwise they would mess up the global state of others. The other thing which could be acceptable is having a chain list of pages, where the head is stored in JerryScript, and each page must start with a next and key pointer pair. The key pointer should point to static const data, where this data could start with a pointer to a free function. This free function could be called when a JerryScript instance is destroyed. The downside is that each data binding must do a linear search to find its corresponding page, which could be costly if there are too many of them. The problem is that no matter what we offer, there will be downsides, and there will be people who won't like those downsides. That is why I prefer a system where people can choose their appropriate solution, rather than forcing one onto them. |
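The chain list of pages described here can be sketched as follows. This is only one possible reading of the comment: the `sketch_` names, the struct layouts, and the demo binding at the end are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_page_free_t) (void *page);

/* Static const descriptor owned by a binding. Its address is the unique key,
 * and it starts with a pointer to the free function. */
typedef struct
{
  sketch_page_free_t free_cb;
} sketch_page_key_t;

/* Every page starts with this next/key pair; binding data follows it. */
typedef struct sketch_page
{
  struct sketch_page *next;
  const sketch_page_key_t *key;
} sketch_page_t;

/* Stand-in for the engine instance, which stores only the list head. */
typedef struct
{
  sketch_page_t *head;
} sketch_instance_t;

/* Push a binding's page onto the front of the chain. */
static void
sketch_page_add (sketch_instance_t *inst, sketch_page_t *page,
                 const sketch_page_key_t *key)
{
  assert (inst != NULL && page != NULL && key != NULL);
  page->key = key;
  page->next = inst->head;
  inst->head = page;
}

/* Linear search: the cost grows with the number of registered bindings. */
static sketch_page_t *
sketch_page_find (const sketch_instance_t *inst, const sketch_page_key_t *key)
{
  assert (inst != NULL && key != NULL);
  for (sketch_page_t *p = inst->head; p != NULL; p = p->next)
  {
    if (p->key == key)
    {
      return p;
    }
  }
  return NULL;
}

/* On instance destruction, release each page through its key's free function. */
static void
sketch_instance_destroy (sketch_instance_t *inst)
{
  assert (inst != NULL);
  sketch_page_t *p = inst->head;
  while (p != NULL)
  {
    sketch_page_t *next = p->next; /* read before the page is released */
    if (p->key->free_cb != NULL)
    {
      p->key->free_cb (p);
    }
    p = next;
  }
  inst->head = NULL;
}

/* Demo binding: header first, then binding-specific fields. */
typedef struct
{
  sketch_page_t header;
  int *released_flag;
} sketch_demo_page_t;

static void
sketch_demo_free (void *page)
{
  *((sketch_demo_page_t *) page)->released_flag = 1;
}

static const sketch_page_key_t sketch_demo_key = { sketch_demo_free };
```

Because the key is the address of a static object, two bindings cannot collide by accident, which addresses the string-uniqueness concern raised elsewhere in the thread at the price of the linear search.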
If that's the case, you're basically asking various independent projects to agree on a way to attach arbitrary data to a context, and implement a library that does that. You will then make room in the context by giving them a |
It will likely work this way. Probably there will be 2-3 major libraries which projects can use. They will offer different advantages and their perf/memory cost will be different. Probably there will be a very high level library, which offers generic interfaces at high perf/memory costs, and another utility based library which just offers useful utility functions. The primary aim for JerryScript is that it must work well on a system with 64K RAM and a 100MHz 32 bit CPU. That does not leave much space for generic hash tables, especially if the memory is highly fragmented. |
However, when you have 512K or more memory and the CPU is 600MHz, the story is different. |
Probably shared interfaces could help with this. Think about malloc. We have several malloc implementations, TCMalloc, jemalloc, libc malloc, etc., and you can select the best according to your needs. E.g. TCMalloc is faster but uses more memory than others. However, they still offer the same malloc/free/realloc interface. Probably what we should do is standardize certain interfaces, while keeping their implementations replaceable. The recommended interfaces could go into a common interface library. |
What about if instead of a Is that really so heavy? In fact, internally,

jerry_set_object_native_handle(global_context.public_context, (uintptr_t)&global_context, NULL);

and you could document that if a caller ever did that with the context object the behaviour would be undefined (because they would overwrite internal JerryScript state). |
Wait a sec, I just realized you need a context to do Anyway, instead of a |
There won't be that many contexts lying around, right? Just, like, one per thread, so it's not like you're creating thousands of new objects here. |
Re. @gabrielschulhof 's idea of using a JS object that is non-reachable from the JS code itself: that would add quite a bit of overhead in terms of runtime speed to access the state. How about something like:
|
I don't mind creating a local object. It could be created the first time its value is requested. I have a question though: the object properties are accessed by strings, so what will guarantee that different modules use different strings? The address of a static variable is more reliably unique. |
The whole point so far was that we don't know the number of modules at compile time. What happens if a module is loaded/unloaded? Unused modules should not be loaded, that consumes a lot of memory. |
OK, well you could have something like:
I think as a project developer you do know the maximum possible number at compile time. I'm not considering loading dynamic libs or anything like that, I think that's probably out of scope ;) However, a module developer does not know what other modules will be included (both compile+run time) by the project developer. |
Name clashes with strings are easy to resolve. See DNS or npm. It's your project name. Or you can use uuidgen :) |
Why? In iot.js only those modules are loaded which are actually used to save memory. |
Also, the project developer chooses which modules to bundle so I don't think they're likely to bundle two modules with an identical name. |
@gabrielschulhof re. "set a value" (the binding's state): happens in I guess I didn't show using the state. I'll edit the earlier sketch to show this as well. |
Oh, right ... I see. |
Yes, but... After thinking about it a bit longer: I think it's worthwhile assigning an index for each module (at boot, "registration time"). That would make the look-up of the binding's state fast and simple vs. having to look up by address (need a data structure more complicated than a simple array to make that possible). I think it makes sense to optimize for fast look-up because that is what is likely to happen a lot. If we don't like...
... we could also make that array dynamic and realloc when needed, but I suspect most project developers will know at compile time what the max number of modules is. |
Yeah, in Zephyr.js, before going with the linked list I also thought about making an array of these module definitions. In JerryScript a run-time fixed-size, compile-time configurable array totally makes sense. |
Then, I guess each context will have an array where the indices are the same as what the modules receive at startup and the values are the module state. |
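A rough sketch of that arrangement, assuming a compile-time maximum and indices handed out once at boot. The names (`SKETCH_MODULE_COUNT`, `sketch_module_register`, and so on) are hypothetical and the context is a stand-in struct, not the real `jerry_context_t`.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MODULE_COUNT 4 /* hypothetical compile-time maximum */

/* Process-wide registration counter; indices are shared by all contexts. */
static size_t sketch_registered_modules = 0;

/* Called once per module at boot. Returns the module's index,
 * or -1 when the compile-time maximum is exhausted. */
static int
sketch_module_register (void)
{
  if (sketch_registered_modules == SKETCH_MODULE_COUNT)
  {
    return -1;
  }
  return (int) sketch_registered_modules++;
}

/* Stand-in for the per-context storage: one state slot per module index. */
typedef struct
{
  void *module_state[SKETCH_MODULE_COUNT];
} sketch_module_context_t;

/* Store a module's state in the given context. */
static void
sketch_module_set_state (sketch_module_context_t *ctx, int index, void *state)
{
  assert (ctx != NULL);
  assert (index >= 0 && (size_t) index < sketch_registered_modules);
  ctx->module_state[index] = state;
}

/* Constant-time look-up: a single array access, no search or hashing. */
static void *
sketch_module_get_state (const sketch_module_context_t *ctx, int index)
{
  assert (ctx != NULL);
  assert (index >= 0 && (size_t) index < sketch_registered_modules);
  return ctx->module_state[index];
}
```

The same index addresses a different slot in each context, so a module keeps independent state per JerryScript instance while its look-up stays a plain array access.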
I don't really see a theoretical difference between a compile time defined array and a single pointer. Both have the same limitations. Based on this conversation, my proposal is the following: JerryScript side (pseudo code only, the names should be better of course)
In the JerryScript utility library, there would be an interface:
All modules must use this interface. There could be several implementations: fixed array, chain list, hashmap, and the developer chooses the appropriate one depending on the use case. However the modules must not depend on any implementation, only the interface. |
@zherczeg I agree with you except for 2 minor questions. Just to make sure.
|
I just realized that a module cannot store the module id, since it needs storage for that. Hence we need a unique pointer (id). Also it would be a good idea to specify the size of the module specific area, since the manager could allocate a single area for a header and a module specific space (useful for a chain list or hashmap based implementation). The modified interface:
I think "standard module" could be defined somewhere, and each standard module must follow these guidelines (this is independent from JerryScript). Non-standard modules can do whatever they want, and the system developers must handle them. |
Another idea: id could be a static string, and it should be the name of the module, so the name of the active modules could be printed for standard modules. This is just an idea, we could drop it if you don't like it. Also |
Given an expected number of modules (JERRY_CONTEXT_MODULE_COUNT) which can be defined at compile time, reserve two pointers in each context for module-specific data: one void * for the data, and a function pointer for the deleter to be called when the context is discarded. The modules receive an index into the data array when they register. They can later use the index to store data in the context. Re jerryscript-project#1717
Given an expected number of modules (JERRY_CONTEXT_MODULE_COUNT) which can be defined at compile time, reserve two pointers in each context for module-specific data: one void * for the data, and a function pointer for the deleter to be called when the context is discarded. The modules receive an index into the data array when they register. They can later use the index to store data in the context. Re jerryscript-project#1717 JerryScript-DCO-1.0-Signed-off-by: Gabriel Schulhof [email protected]
@zherczeg Questions about your proposal:
I like your idea of having a "standard" interface that supports multiple implementations that have different design goals. Suggestion: I'd rather roll up
|
It is up to you. I think you have more experience with modules, and for me this is kind of outside the scope of jerryscript core, so I am open to all ideas. I also like your struct based register idea. The only thing I don't want is adding too much module specific stuff to JerryScript core, which is a JS engine, not a NodeJS-like system. I don't mind if we create a new top level directory in jerryscript, e.g. jerry-utilities, and add a module directory there (jerry-utilities/module), where the documentation and source code of standard modules could be placed. I also don't mind adding compilation support for these utilities to the core build system. |
After having spoken with @martijnthe we believe we have identified the minimum necessary support from core:

typedef void *(*jerry_user_context_init_cb)(void);
typedef void (*jerry_user_context_deinit_cb)(void *context);
typedef struct {
  // ... existing stuff
  void *user_context;
  jerry_user_context_deinit_cb deinit_cb;
} jerry_context_t;
// init_cb is called before jerry_init_with_user_context() returns
// (so no need to save in jerry_context_t)
void jerry_init_with_user_context (jerry_init_flag_t flags,
                                   jerry_user_context_init_cb init_cb,
                                   jerry_user_context_deinit_cb deinit_cb);
void *jerry_get_user_context(void);

This allows each context to be initialized with a custom "user context", which is allocated and stored during context initialization. It cannot be re-initialized during the lifetime of the context, and it will be deallocated using the provided deleter upon context disposal. The original `jerry_init()` then becomes:

void jerry_init (jerry_init_flag_t flags) {
  jerry_init_with_user_context (flags, NULL, NULL);
}

I will make a separate PR to this effect, affecting only core. I will then render the module utility PR as a second modification that requires this change (and that will likely use #1725 for testing). |
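To show the lifecycle this proposal implies, here is a self-contained mock. It does not call JerryScript: the `sketch_` functions imitate the proposed `jerry_init_with_user_context()`, `jerry_get_user_context()` and the cleanup path, the `flags` parameter is left out, and the "agent" at the end is an invented example of a user context manager.

```c
#include <assert.h>
#include <stddef.h>

typedef void *(*sketch_user_context_init_cb) (void);
typedef void (*sketch_user_context_deinit_cb) (void *user_context);

/* Stand-in for jerry_context_t; only the two proposed fields are modelled. */
typedef struct
{
  void *user_context;
  sketch_user_context_deinit_cb deinit_cb;
} sketch_engine_context_t;

static sketch_engine_context_t sketch_engine;

/* init_cb runs before this function returns, so only its result and the
 * deleter need to be kept in the context. */
static void
sketch_init_with_user_context (sketch_user_context_init_cb init_cb,
                               sketch_user_context_deinit_cb deinit_cb)
{
  sketch_engine.user_context = (init_cb != NULL) ? init_cb () : NULL;
  sketch_engine.deinit_cb = deinit_cb;
}

/* There is a getter but deliberately no setter. */
static void *
sketch_get_user_context (void)
{
  return sketch_engine.user_context;
}

/* Cleanup hands the stored pointer back to the deleter exactly once. */
static void
sketch_cleanup (void)
{
  if (sketch_engine.deinit_cb != NULL)
  {
    sketch_engine.deinit_cb (sketch_engine.user_context);
  }
  sketch_engine.user_context = NULL;
  sketch_engine.deinit_cb = NULL;
}

/* Invented example agent that would manage all per-context binding state. */
typedef struct
{
  int live;
} sketch_agent_t;

static sketch_agent_t sketch_agent_storage;

static void *
sketch_agent_init (void)
{
  sketch_agent_storage.live = 1;
  return &sketch_agent_storage;
}

static void
sketch_agent_deinit (void *user_context)
{
  assert (user_context != NULL);
  ((sketch_agent_t *) user_context)->live = 0;
}
```

Passing `NULL` for both callbacks reproduces the plain `jerry_init()` behaviour described above: no user context is stored and nothing extra runs at cleanup.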
Note that by design, we did not provide a "setter", but instead an |
Exactly, and we want the |
…-project#1717) This modification makes it possible to initialize a context in such a way that a `void *` pointer is stored inside the context and is made available via a new `jerry_get_user_context()` API. The pointer is initialized via a new `jerry_init_with_user_context()` API, which calls the existing `jerry_init()`, after which it sets the value of the new `user_context` element in the `jerry_context_t` structure using the context allocation callback provided as the second parameter to the new `jerry_init_with_user_context()` API. The location of the cleanup function responsible for deallocating the pointer created by the context allocation callback is provided as the third parameter. This location is stored in the context along with the pointer itself. When a context is discarded via `jerry_cleanup()`, the user context cleanup function is called to dispose of the pointer stored within the context. The semantics behind the API are such that it is now possible to choose for each context an agent which manages arbitrary user data keyed to the given context. The agent must be chosen at context instantiation time and cannot be changed afterwards, remaining in effect for the lifetime of the context. Fixes jerryscript-project#1717 JerryScript-DCO-1.0-Signed-off-by: Zidong Jiang [email protected]
I like this proposal! +1 from me |
…-project#1717) This modification makes it possible to initialize a context in such a way that a `void *` pointer is stored inside the context and is made available via a new `jerry_get_user_context()` API. The pointer is initialized via a new `jerry_init_with_user_context()` API, which calls the existing `jerry_init()`, after which it sets the value of the new `user_context` element in the `jerry_context_t` structure using the context allocation callback provided as the second parameter to the new `jerry_init_with_user_context()` API. The location of the cleanup function responsible for deallocating the pointer created by the context allocation callback is provided as the third parameter. This location is stored in the context along with the pointer itself. When a context is discarded via `jerry_cleanup()`, the user context cleanup function is called to dispose of the pointer stored within the context. The semantics behind the API are such that it is now possible to choose for each context an agent which manages arbitrary user data keyed to the given context. The agent must be chosen at context instantiation time and cannot be changed afterwards, remaining in effect for the lifetime of the context. Fixes jerryscript-project#1717 JerryScript-DCO-1.0-Signed-off-by: Gabriel Schulhof [email protected]
LGTM |
This modification makes it possible to initialize a context in such a way that a `void *` pointer is stored inside the context and is made available via a new `jerry_get_user_context()` API. The pointer is initialized via a new `jerry_init_with_user_context()` API, which calls the existing `jerry_init()`, after which it sets the value of the new `user_context` element in the `jerry_context_t` structure using the context allocation callback provided as the second parameter to the new `jerry_init_with_user_context()` API. The location of the cleanup function responsible for deallocating the pointer created by the context allocation callback is provided as the third parameter. This location is stored in the context along with the pointer itself. When a context is discarded via `jerry_cleanup()`, the user context cleanup function is called to dispose of the pointer stored within the context. The semantics behind the API are such that it is now possible to choose for each context an agent which manages arbitrary user data keyed to the given context. The agent must be chosen at context instantiation time and cannot be changed afterwards, remaining in effect for the lifetime of the context. Fixes #1717 JerryScript-DCO-1.0-Signed-off-by: Gabriel Schulhof [email protected]
A while ago, the `JERRY_CONTEXT` define and `jerry_context_t` were added. Before that change, all the fields in `jerry_context_t` were `static`s that were sprinkled throughout the JerryScript code base. This wasn't great for projects where multiple instances of JerryScript are needed, because `static`s are by definition "singleton".

Luckily, the addition of the `JERRY_CONTEXT` define made it easier to create multiple instances of JerryScript. Typically, to support multiple instances, one would provide a custom definition of `JERRY_CONTEXT`, for example to look up the current `jerry_context_t` in thread local storage.

However, often, binding code also needs to keep state regarding the native resources that it uses. In a multi-context scenario, each instance of JerryScript needs to have its own "instance" of this binding state as well. Therefore, `static` variables cannot be used. Again, because `static` equals "singleton".

`__thread` or a platform-specific thread local storage (TLS) API could be used of course, but this would make the binding code less portable / re-usable.

Therefore, I think it makes sense to try to come up with an API to associate binding state with a JerryScript context in an easy way that promotes re-use and portability of the binding code.

Some design goals that I have in mind for an API to associate binding state with a JerryScript context:
- […] `__thread` or a platform specific TLS API.
- […] `jcontext.[h|c]`).
- […] `static`, but hopefully close?)

I'm imagining something like a `void *user_binding_state[]` array in `JERRY_CONTEXT`, plus a couple of `jerry_...` functions to register/allocate/get/free the state for a particular binding. But I'm not quite sure how such a solution would "scale down".

Thoughts?