Introduces write/read and correct stream type transforming #33
4aecd56
to
1ca5621
I think you meant flatMap()? batch() works the other way round: it takes 1 or more input chunks and produces 1 output chunk.
Yes, thanks for pointing that out, corrected 👍
Could we extract at least the commit "Simplify IFCA drain resolving; extract ProcessingQueue" from this PR? From what I understand it stands on its own, and extracting it would make reviewing the rest easier.
1ca5621
to
381ca18
You got me there 😄 🙈 I've extracted the changes related to the mentioned commit into a separate PR.
After we talked F2F with @jan-warchol, we have concluded that |
There is still one inconsistency with types that I see. Each

    map<NEW_OUT, ARGS extends any[] = []>(
        callback: TransformFunction<OUT, NEW_OUT, ARGS>,
        ...args: ARGS
    ): DataStream<IN, NEW_OUT> {
        if (args?.length) {
            this.ifca.addTransform(this.injectArgsToCallback<NEW_OUT, typeof args>(callback, args));
        } else {
            this.ifca.addTransform(callback);
        }
        return this.createChildStream<NEW_OUT>();
    }

Since we use mutation to change the IFCA type, it simply works because this is the same instance underneath, and the types are also correct. But TBH, to make the code really correct (in the context of generic types) it should be something like:

    map<NEW_OUT, ARGS extends any[] = []>(
        callback: TransformFunction<OUT, NEW_OUT, ARGS>,
        ...args: ARGS
    ): DataStream<IN, NEW_OUT> {
        let newIfca: IFCA<IN, NEW_OUT, any>;
        if (args?.length) {
            newIfca = this.ifca.addTransform(this.injectArgsToCallback<NEW_OUT, typeof args>(callback, args));
        } else {
            newIfca = this.ifca.addTransform(callback);
        }
        this.ifcaChain.swap(newIfca); // swap the previous IFCA with the newly typed one
        return this.createChildStream<NEW_OUT>();
    }

But after transpiling to JS, it would be changing

Any ideas? 🤔
Regarding the latest comment - #33 (comment), I have slightly modified I think, this irons out the last inconsistency that was there in how type changes are handled.
70b1ee3
to
ddcd9cb
The 'IFCAChain' class is an additional layer between stream instances and IFCA instances. It makes it easier for related stream instances to operate on IFCA instances and to share such IFCA instances between multiple stream instances. It was introduced to allow writing to any intermediate stream instance created during transforms and to allow chaining and sharing IFCAs between such instances.
Utilize 'IFCAChain' so intermediate streams with a shared IFCA can be created more easily. It also allows writing/reading on any intermediate stream to use the correct IFCA instance - first in the chain for writing, last in the chain for reading. Apart from that, a new stream instance is always created for each transform, which allows managing readability/writability (and similar) separately.
b161400
to
664b286
Rebased onto latest |
This PR covers two closely related features:
.write() and .read() methods.

This PR is based on the task/simplify-ifca-drain branch, which does a bit of cleanup in the IFCA class, so you may want to look there as well.

Also, please accept my apologies for adding multiple changes at once (in single commits and in this PR), but since the whole PR is logically a single thing, it was really hard to separate it into smaller, reasonable pieces.
Write, read and related methods
Assumptions
The .write() and .read() methods are for more manual stream data control: they allow writing and reading directly to/from any given stream. This also means there should be an ability to end the stream manually and control its flow (pause/resume). And so the .end(), .pause() and .resume() methods were also introduced.

When it comes to the public API, one of the most important things to consider is when one can write/read to/from a stream instance. As transform functions (like .map(), .filter(), .batch(), etc.) always return a new stream instance, it's important to know which stream instance can be written to and which instance can be read from.

After some brainstorming, the assumptions are as below. Considering the following code:
We have concluded that:
(stream1, stream2, stream3, stream4). This effectively writes to the entire transformation chain, so a value will always go through all the transforms.
(stream1, stream2, stream3, stream4). This will have an effect on the entire stream chain.
(stream4). Reading from other instances should throw an error.

Implementation
The .write() and .read() methods are just proxies to the same methods of the stream's internal IFCA instance, so exposing those methods was required. Since each written value should go through all the transforms, .write() always needs to write to the first stream's IFCA (the entry point for all transformations) and .read() should read from the last stream's IFCA (the exit point after all transformations are applied). This required introducing some gluing mechanism between related streams (and their internal IFCAs), and thus IFCAChain was introduced (see below).

Also, since we would like to have full control over each stream instance created through transforms, each transform returns a new stream instance. Before, we would mutate instances where possible; however, since it was the same object underneath, this resulted in read/write permission inconsistencies.
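
To illustrate the rules above, here is a minimal sketch of the write-to-first, read-from-last behaviour. `MiniChain` and `MiniStream` are hypothetical, synchronous stand-ins (the real IFCA is asynchronous and its API is not reproduced here), and the three `.map()` calls are an assumed chain; only the names `stream1` to `stream4` come from the description.

```typescript
// Hypothetical, synchronous stand-ins; not the real IFCA or DataStream classes.
type Transform = (chunk: any) => any;

class MiniChain {
    // Shared by every stream created from the same source stream.
    readonly transforms: Transform[] = [];
    readonly results: unknown[] = [];
    ended = false;
}

class MiniStream<IN, OUT> {
    constructor(
        private readonly chain: MiniChain = new MiniChain(),
        private readable = true
    ) {}

    map<NEW_OUT>(callback: (chunk: OUT) => NEW_OUT): MiniStream<IN, NEW_OUT> {
        this.chain.transforms.push(callback);
        // The parent is no longer the end of the chain, so it stops being readable.
        this.readable = false;
        return new MiniStream<IN, NEW_OUT>(this.chain);
    }

    write(chunk: IN): void {
        if (this.chain.ended) throw new Error("Stream already ended");
        // Simplification: transforms are applied eagerly at write time.
        // Every write enters at the start of the chain, whichever instance receives it.
        const result = this.chain.transforms.reduce((value, fn) => fn(value), chunk as unknown);
        this.chain.results.push(result);
    }

    read(): OUT | undefined {
        if (!this.readable) throw new Error("Only the last stream in the chain is readable");
        return this.chain.results.shift() as OUT | undefined;
    }

    // Ending any instance ends the whole chain.
    end(): void {
        this.chain.ended = true;
    }
}

const stream1 = new MiniStream<number, number>();
const stream2 = stream1.map((x) => x + 1);
const stream3 = stream2.map((x) => x * 2);
const stream4 = stream3.map((x) => `value: ${x}`);

stream2.write(1); // written through an intermediate stream
stream4.write(2); // written through the last stream
const first = stream4.read();  // "value: 4"
const second = stream4.read(); // "value: 6"
```

In this sketch, calling `stream2.read()` throws, and calling `.end()` on any of the four instances stops writes on all of them, which mirrors the conclusions listed under Assumptions.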
Transforming streams and its IN/OUT types
Assumptions
As the basic stream class is generic (DataStream<IN, OUT>), and some derived classes will be too, it consumes only values of type IN and produces only values of type OUT. This means any chainable transform (one which returns an instance of DataStream), like .map() or .batch(), changes the OUT type of the returned stream. This requires returning a new instance,

Implementation
First of all, all transform methods were changed to always return a new stream instance. This is done through the helper method .createChildStream(), which subclasses can override to have full control over what type of instance is created (see #28).

Each transform creates a new stream instance, but internally the transform chain may be the same - it may be a single IFCA instance. OTOH ordering transforms (like .batch() or .flatMap()) require creating a new IFCA instance. And so multiple transforms may result in just a single IFCA internally or in multiple connected IFCAs. This is again handled by the IFCAChain class, which manages internal IFCA connections.

Let's get back to transform types for a second. When it comes to input/output of a single transform:
1. One input chunk produces one output chunk (.map())
2. One input chunk produces zero or one output chunk (.filter())
3. One input chunk produces zero or more output chunks (.flatMap())

Transforms may also need to receive chunks in input order or not, so we have:

4. Chunks do not need to be received in input order (.map())
5. Chunks need to be received in input order (.batch())

In the current implementation, IFCA can internally handle 1, 2, and 4; 3 and 5 need to be handled outside of the IFCA instance. This means that streams created by transform methods of type 1, 2, and 4 internally use the same IFCA instance, while those of type 3 and 5 need to create a new IFCA instance. This is important for understanding the types of the new IFCAs that are created and chained by transforms of type 3 and 5. For example, .flatMap() does this.ifcaChain.add<NEW_OUT, NEW_OUT>(this.options);. Such methods always create new IFCAs which initially have the same IN and OUT type. This is because the transform function runs outside of IFCA, so it writes its result to a new IFCA instance (which is then used by the newly created stream instance).

IFCA Chain
As mentioned above, IFCAChain is a kind of glue used to solve both problems described above. As for how a stream's internal IFCAs are changed through transforms, see 9254854.

Other changes
StringStream.batch() so it now returns DataStream<string[]> instance (see 4aecd56#diff-d86d1c3ae4fe9c1daa1fa9f644d42e20df0a85c86c3796da3fffcbf16f243eb9R141).
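
The typing reason behind that last change can be sketched as follows: once string chunks are grouped, each output chunk is a `string[]`, so the result can no longer be typed as a string stream. The classes below are hypothetical simplifications, and the numeric `size` argument to `batch()` is an assumption made for brevity (the real method signature is not shown in this description).

```typescript
// Hypothetical, simplified classes; only the typing mirrors the PR description.
class MiniDataStream<IN, OUT = IN> {
    constructor(protected readonly chunks: OUT[] = []) {}

    // Subclasses can override this to decide which class a transform returns.
    protected createChildStream<NEW_OUT>(chunks: NEW_OUT[]): MiniDataStream<IN, NEW_OUT> {
        return new MiniDataStream<IN, NEW_OUT>(chunks);
    }

    toArray(): OUT[] {
        return this.chunks.slice();
    }
}

class MiniStringStream extends MiniDataStream<string, string> {
    // Chunks stop being strings after batching, so the result is typed as a
    // plain data stream of string arrays rather than as a string stream.
    batch(size: number): MiniDataStream<string, string[]> {
        const batches: string[][] = [];
        for (let i = 0; i < this.chunks.length; i += size) {
            batches.push(this.chunks.slice(i, i + size));
        }
        return this.createChildStream<string[]>(batches);
    }
}

const batched = new MiniStringStream(["a", "b", "c"]).batch(2);
const groups = batched.toArray(); // [["a", "b"], ["c"]]
```

Here `batched` is a `MiniDataStream<string, string[]>` and not a `MiniStringStream`, so string-only methods are not offered on it by the type system.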