-
Notifications
You must be signed in to change notification settings - Fork 7
Add RFC on data serialization/deserialization #8
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good so far. Could we add perma-links to a few places where we do this today in open source repos?
text/0000-serialize-deserialize.md
Outdated
SharedState: SharedStateToken, | ||
}, | ||
middleware: ({SharedState}) => { | ||
const {serialize, deserialize} = SharedState('some-id'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to specify a string id here? Wondering if we could generate one instead. Maybe even use a Token from createToken() that could specify the shape of the data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a string, especially to make the mocking experience good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be worth adding a strawman example of what the testing experience would look like using tokens instead of string ids to the alternatives section. I agree mocking will probably be easier with strings than working with tokens, but maybe it's possible to have a token-based API that's still pretty good?
I'd feel sad adding string ids as part of this API especially considering how much effort/time was put into making a DI system that didn't rely on them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could have the id be configured via a token, I just feel like it is kinda overkill. The plugin can export a UUID if you really want it to be guaranteed unique. FWIW, we are using string ids today.
Here is an example of what it would look like configuring the string as a token:
const MyPluginSharedStateIDToken = createToken('MyPluginSharedState');
createPlugin({
deps: {
SharedState: SharedStateToken,
SharedStateID: MyPluginSharedStateIDToken
},
middleware({SharedState, SharedStateID}) {
const {serialize, deserialize} = SharedState(SharedStateID);
}
});
// ... in main.js
app.register(MyPluginSharedStateIDToken, 'some-id');
app.register(SharedStateToken, MockSharedState({
'some-id': mockdata
});
This works, but honestly I think it is too much boilerplate as the number of apps which need serialization grows. I would prefer a pattern where the plugin exports the serialization id.
// In fusion-plugin-redux-react
export const SerializationID = '@fusion-plugin-redux-react/serialization-id'
// In test code
import {SerializationID} from 'fusion-plugin-redux-react';
app.register(SharedStateToken, MockSharedState({
[SerializationID]: mockdata
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String IDs just feel off to me when we have tokens. Tokens feel like the right tool to me here to avoid naming collisions. Are we not concerned about possible naming collisions at all?
Here's one example of an API I was playing with, still on the fence, but it's using tokens, thoughts?
// ... in main.js
app.enhance(SharedStateToken, SharedState => SharedState(MyPluginSharedStateToken, shareddata));
// In test code
app.enhance(SharedStateToken, SharedState => SharedState(MyPluginSharedStateToken, mockdata));
// In plugin:
const {serialize, deserialize} = SharedState.extract(MyPluginSharedStateToken);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is pretty safe to use strings for this type of thing, especially since we are doing it today already with no problem. The string could be prefixed with the name of the npm package for example.
'some-id': { | ||
some: 'json' | ||
}, | ||
'other-data': {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would this work with custom serialization? For example the immutable case, would transit.toJSON
be used directly, or should somehow the actual ImmutableSharedState
serializer be used? What does this look like when multiple serializers are used (w/ aliasing like you have in the example below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't actually change the serialization, only the deserialization. So whether it is custom or default, this will return whatever the value is in the map. You could put an immutable-js object here directly
text/0000-serialize-deserialize.md
Outdated
return transit.fromJSON(document.getElementById(id).textContent); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this example, wouldn't it be better to just use the standard JSON serializer/deserializer (by using the the serialized immutable-transit
strings as values) rather than a custom out-of-band serialization/deserialization?
I can't think of anything that would need for true custom serialization/deserialization aside from CSS-in-JS implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand what you mean here. Could you elaborate, maybe with an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the actual HTML part of the serialization and standardization can be entirely kept in core and not exposed. Custom deserializers/serializers need not involve HTML at all, just A → string
and string → A
functions, and the core SSR plugin itself can handle the HTML part.
So you'd have something sort of like this.
const DefaultSharedState = {
serialize: data => JSON.stringify(data),
deserialize: serializedData => JSON.parse(serializedData)
};
const ImmutableSharedState = {
serialize: data => transit.toJSON(data),
deserialize: serializedData => transit.fromJSON(serializedData)
};
I don't see the need for serialization/deserialization code to be actually concerned with modifying the HTML and/or querying the HTML. The only special case is CSS-in-JS, and in that case, they will want to implement that in a totally custom way rather than using this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true if we limit what you can do with the serialization / deserialization. For example, this would not be possible if we wanted that type of API:
const WindowSharedState = (id) => {
return {
serialize: (ctx, data) => {
ctx.template.head.push(
html`<script nonce=${ctx.nonce}'>window.SOME_DATA=${JSON.stringify(data)}</div>`
);
},
deserialize: () => {
return window.SOME_DATA
}
}
}
As long as we are okay with limiting the types of serialization/deserialization to <script type='application/json>
then I am okay with this. I'm just not quite sure if this limitation will be a problem for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's doubtful that constraint will be an issue in practice. The only non-contrived use case I've been able to come up thus far is CSS-in-JS, but it's unclear to me how mocking state would make sense in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - I can update the RFC to reflect that API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially what might be simpler is all values are always JSON.stringify
/JSON.parse
, and if you want something else you can just serialize/deserialize into a string, which is valid JSON. Although I suppose this might be more overhead.
This can make testing difficult, especially when running browser tests since there is not | ||
an easy way to mock the deserialization other than each plugin exposing their own deserialization | ||
token. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice idea! I think generally it's good if we move things that do side effects into the DI system (for testing as you mention). Hopefully this would make it easier for more plugins to be totally pure.
That said, for testing, I think long-term, hopefully we can make the puppeteer-style testing story easier so not as much needs to be mocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, a separate issue from this is having puppeteer style tests with access to app
, but still I think this is useful even when we have support for that.
So I assume |
I would rather have no dependencies needed by default. However, since this is in the DI system, it would be easy for an engineer to replace the serialization mechanism with BSON if they want. However, I would say just use timestamps rather than date objects to simplify things |
text/0000-serialize-deserialize.md
Outdated
return (ctx, next) => { | ||
if (__BROWSER__) { | ||
// load and parse serialized data from dom node | ||
const data = deserialize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you do with the resulting data in this case?
How do you make it available to other parts of the app? (either redux state or react context)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would still be done via react context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm referencing the fact that data
is assigned to a variable but then nothing is done with it. Can you add examples for redux (I know you have one later down) and React context?
|
||
# Motivation | ||
|
||
It is a very common pattern in fusion to have some shared state between the server and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What types of shared data are we expecting to support? Should you expect to serialize your entire redux state tree? Should you use it for storing preferences? Or how agnostic should it be to your data type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default this will just support JSON serializable data, but since the serializer/deserializer is in the DI system, you can basically add support for any data type you want.
text/0000-serialize-deserialize.md
Outdated
const ImmutableSharedState = (id) => { | ||
return { | ||
serialize: (ctx, data) => { | ||
ctx.template.body.push( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you writing to and reading from HTML directly? I'd rather have the framework abstract this away for me.
I understand you want an escape hatch, but I think the default case should be that the framework sets the return value of this function in the DOM, and then on the browser is responsible for trying to find that element and passing its results as an argument to deserialize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default the framework handles writing to html / dom. It is only in the case where you want custom serialization / deserialization when you need to do this manually.
The reason I think we need the escape hatch to include html/dom APIs is it could be dependent on your serialization mechanism. For example, you might want to serialize it into an inline js script rather than a json script tag.
|
||
There are many variations of this API we should consider. One potential option would be to have a `ctx.serialize` and `ctx.deserialize` function, | ||
and provide a way to mock those functions in testing via something in `fusion-test-utils`. This could solve the issue with testing, but would | ||
require introducing new APIs to `fusion-test-utils`. While the above proposal does introduce new Tokens, it uses the DI system to solve the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this solution instead of overloading ctx further
'some-id': { | ||
some: 'json' | ||
}, | ||
'other-data': {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be deserialize(serialize(mockValue))
or just mockValue
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just the mockValue
|
||
```js | ||
import {MockSharedState, SharedStateToken} from 'fusion-core'; | ||
// Deserialize calls will now return the mock values for these ids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a couple aspects of this particular explanation aren't totally clear to me
- What's the difference between setting the shared state, setting the default shared state, and mocking the shared state, and overriding the shared state? Is this API only used for testing? What happens when it's used normally? I think hopefully we could avoid using "mock" as part of the API if it's not necessarily testing-specific.
- Does this change which "deserialize" function is called, or merely change the values on which deserialize operates on? Part of this explanation seems to imply that
deserialize
is mocked, but the name of the actual function makes it seem like only the itself data is being changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea open for suggestions on the naming here. I actually originally put the MockSharedState
(or w/e we call it) into fusion-test-utils
because it really should only be used for testing. It is really just a helper for mocking deserialize
. Maybe it could mock serialize
also, but I think that will be less useful. But that comment I think describes it pretty well:
Deserialize calls will now return the mock values for these ids
It does just that.
|
||
This can make testing difficult, especially when running browser tests since there is not | ||
an easy way to mock the deserialization other than each plugin exposing their own deserialization | ||
token. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this solution that bad (i.e. just having a token for each plugin)? I guess it depends on a few factors
- How frequently you want to mock the "universal" states for multiple separate plugins
- How much boilerplate/docs is required
However, I think there are some potential advantages of per-plugin state mock/override tokens:
- Better type safety? I'm not totally sure but it seems like this might be difficult to type
- No need for string IDs (no typos)
- Plugins can't have implicit dependencies on other plugin's shared state (using string ids)
import {FooStateToken} from "fusion-plugin-foo";
import {BarStateToken} from "fusion-plugin-bar";
import {BazStateToken} from "fusion-plugin-baz";
// in tests
app.register(FooStateToken, /* something */);
app.register(BarStateToken, /* something */);
app.register(BazStateToken, /* something */);
I suppose this is more verbose, but could be remedied:
import {FooStateToken} from "fusion-plugin-foo";
import {BarStateToken} from "fusion-plugin-bar";
import {BazStateToken} from "fusion-plugin-baz";
function mockState(app, state) {
app.register(FooStateToken, state.foo);
app.register(BarStateToken, state.bar);
app.register(BazStateToken, state.baz);
}
// in tests
mockState(app, {foo, bar, baz});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think what you are proposing is correct or possible. Maybe you are missing some context from the RFC. The user doesn't provide the data, the plugin provides the data. So the data itself cannot come from the token. Maybe we should have a quick chat IRL to go over the doc
Chatted with @rtsao offline, going to wait on this RFC until we put together a separate RFC for |
I was thinking about making an RFC for "universal env vars" but realized this is basically the same problem. I think we can consider this issue in terms of three degrees of "universal" state:
For deployment-time values (such as env vars) a convenient API might look something like: import {universalEnv} from "fusion-core";
const cdnURL = universalEnv("CDN_URL"); // build-time enforcement that param is string literal I think the takeaway here is consumers of the above API probably shouldn't have to interact with context at all because the values are invariant between requests. That said, it seems like at least part of the underlying implementation should be shared because it's conceptually the same thing, just sort of a simplified case where state doesn't vary between requests. It would be nice to have a unifying "universal transport" abstraction here. Perhaps one way to aligning these is if there was some default value which is context is omitted. Here's an idea loosely inspired by the React.createContext API: import {createUniversalState} from "fusion-core";
const {getState, setState} = createUniversalState("some unique id", "a default value");
getState();
// => "a default value"
// Or within a middleware:
setState(ctx, "some new value");
getState(ctx);
// => "some new value" Some of the implementation details are still a bit fuzzy but I think this would be interesting idea to explore. |
Rendered