
[ WIP ] pinole integration, next gen transports #937

Closed

Conversation

@davidmarkclements (Member) commented Dec 13, 2020

For background see #614 and https://github.com/pinojs/pinole

  • transports that run in worker threads
  • supports ESM and CJS next gen transports
  • supports "legacy" transports
  • supports flushSync

Rollout plan

  • Part of Pino v7
  • Supports Node 12 and Node 14
  • I suggest we release v7 before Node 10 is at end-of-life - Node 10 users can stay on Pino 6.

Pino transport API

Package support:

  const logger = pino(pino.transport('my-transport', {some: 'options'}))

File path support:

  const logger = pino(pino.transport('./path/to/my/local/transport.js', {some: 'options'}))

The options object must be clonable because it's passed to the worker thread - e.g. it can't contain functions.
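
As a rough illustration of what "clonable" means here (the option names below are made up for the example), anything that survives the structured clone algorithm used by worker_threads is fine; functions are not:

import pino from 'pino'

// Hypothetical options object - plain data only, because it crosses the
// worker boundary via the structured clone algorithm.
const goodOpts = { destination: '/var/log/app.log', colorize: true }
const logger = pino(pino.transport('my-transport', goodOpts))

// This would fail to clone - functions can't cross the thread boundary:
// const badOpts = { format: (line) => line.toUpperCase() }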

Next Gen Transports

Transports may be ESM or CJS. I'm encouraging ESM for transports.

A canonical next gen transport runs in a worker thread and looks as follows:

export default (opts) => {
  return (data, sync) => {
     // do something with data
     // useful sync boolean tells us if the main thread is trying to sync flush
  }
}

For asynchronous initialisation, ESM gives you top-level await, but for any async work based on opts you can also make the exported function async:

export default async (opts) => {
  // do something awaiting with values from opts
  return (data, sync) => {
     // do something with data
     // useful sync boolean tells us if the main thread is trying to sync flush
  }
}

For asynchronous writes, make the returned writing function async:

export default async (opts) => {
  return async (data, sync) => {
     // do something with data and await
  }
}
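
Purely as an illustration of the shape above (not part of this PR), here is a minimal sketch of a next gen transport that appends each log line to a file; opts.destination and the fs.promises usage are assumptions made for the example:

import { promises as fs } from 'fs'

// Hypothetical file transport - runs inside the worker thread.
export default async (opts) => {
  const destination = opts.destination || './app.log'
  return async (data, sync) => {
    // data is the encoded log line handed over from the main thread;
    // sync signals that the main thread is performing a flushSync
    await fs.appendFile(destination, data)
  }
}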

Making the write function async is acceptable performance-wise because this is happening off the main thread - the ergonomics of async/await can win here.

Legacy Support

Support for existing transports works like so:

  • IF the transport is a module
  • AND IF its package.json has a "bin" field
  • AND IF its package.json does NOT have a "pino" field with a value other than "legacy"

Then it's treated as legacy. This means existing transports should work automatically, without modification, but they can disambiguate explicitly by adding "pino": "legacy" to their package.json.
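
A rough sketch of that detection logic (illustrative only - not Pinole's actual internals):

// Illustrative only. pkg is the parsed package.json of the resolved
// transport, or undefined when the target isn't a module.
function isLegacyTransport (pkg) {
  if (!pkg) return false        // not a module: must be next gen
  if (!pkg.bin) return false    // no "bin" field: next gen
  if (!pkg.pino) return true    // has "bin" but no "pino" field: assume legacy
  return pkg.pino === 'legacy'  // explicit disambiguation
}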

Pinole wraps legacy transports in a next gen transport wrapper and feeds data into process.stdin, while working around the stdout/stderr disconnect race scenario you get with worker threads during the final tick of process exit: see https://github.com/pinojs/pinole/blob/master/lib/legacy.js

This means flushSync works with legacy as well.

Anything that isn't a module must be a next gen transport.

  • Modules that have a package.json without a "bin" field are interpreted as next gen transports.
  • Modules that have a package.json with a "bin" field but with "pino": "v7" (or any value other than "legacy") are interpreted as next gen transports.

Here's pino-colada getting the legacy treatment from Pinole - this is pino-colada running inside a worker thread, with no modifications to pino-colada itself:

[screenshot: pino-colada output produced via the legacy wrapper in a worker thread]

Approach: Performance & Safety

This approach supports flushSync by blocking the main thread using Atomics, which unblocks conditionally when the worker thread updates an element in a SharedArrayBuffer. This means that in the final tick (e.g. process.on('exit', ...)) we can still do asynchronous work in the worker thread while keeping the main thread alive.

During normal operation no Atomics are used, and we purposefully avoid blocking the main thread at all costs. The worker thread polls the SharedArrayBuffer; the only work the main thread does is write into the SharedArrayBuffer (in the same way that sonic-boom writes to a buffer, using TextEncoder) and then update a metadata section of the SharedArrayBuffer so that the worker thread knows to pass the log line(s) to the transport.
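
A rough sketch of the flushSync handshake described above, assuming a single Int32 flag at the start of the shared buffer (the layout, values and timeout are illustrative, not Pinole's actual protocol):

// Illustrative only - not Pinole's actual buffer layout.
const shared = new SharedArrayBuffer(4 + 64 * 1024) // flag + log data region
const flushFlag = new Int32Array(shared, 0, 1)

// Main thread, inside flushSync(): request a flush, then block until the
// worker clears the flag (bounded by a timeout as a safety valve).
function flushSync () {
  Atomics.store(flushFlag, 0, 1)
  Atomics.wait(flushFlag, 0, 1, 10 * 1000) // blocks while the value is still 1
}

// Worker thread, once it has drained the buffer into the transport:
function ackFlush () {
  Atomics.store(flushFlag, 0, 0)
  Atomics.notify(flushFlag, 0) // wakes the blocked main thread
}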

Todo

Lots more to do, but I wanted to get eyes on this sooner rather than later.

  • docs
  • benchmarking (thinking req/s of an http server)
  • there are some no-op methods in pinole (flush, end, etc.) that could probably benefit from some implementation details, @mcollina advice appreciated
  • 100% testing for pinole (not part of this PR but blocking on it)
  • probably a lot of edge cases and bug fixing in pinole @mcollina it could really use your eyes as well
  • wondering if we should default to using a worker thread for normal stdout writes (depending on benchmarks) UNLESS os.cpus().length < 1 (or <2?)
  • Node 12 support (tests seem not to work, at least on CI)
  • Windows support (tests seem not to work, at least on CI)

@mcollina (Member)

I would err for a more incremental approach and keep the current behavior as default - do not send data to a worker thread unless asked to.

+1 for the great work.

@jsumners (Member)

> I would err for a more incremental approach and keep the current behavior as default - do not send data to a worker thread unless asked to.

Agreed.

But other than that...

[reaction image]

@davidmarkclements (Member, Author)

> I would err for a more incremental approach and keep the current behavior as default - do not send data to a worker thread unless asked to.
>
> +1 for the great work.

This is the current approach - if no transport option is passed to Pinole it returns a SonicBoom instance.

Transports always go in a worker thread - destinations are always written to in the main thread.

This has me thinking though - if there’s only 1 CPU it’s probably faster to run the transport in the main thread, which we can also do.

I’m also thinking of supporting multiple transports/destinations in pino next - just allow an array of transports and/or destinations to be passed as the stream option. Thoughts on that?

@jsumners (Member)

> I’m also thinking of supporting multiple transports/destinations in pino next - just allow an array of transports and/or destinations to be passed as the stream option. Thoughts on that?

😬

Not sure. I'd love to deprecate pino-multi-stream. But I think perpetuating it with this will be an even bigger headache. Still, we know people are going to ask for it.

@mcollina (Member)

We should really embed pino-multi-stream if we went that route.

@davidmarkclements (Member, Author)

> We should really embed pino-multi-stream if we went that route.

ok let's get this done and look at that separately

@mcollina (Member)

This still needs:

  • GitHub Actions update to remove Node 10 and 13
  • fixup for failing tests
  • docs

@jmealo commented Mar 20, 2021

@davidmarkclements this looks awesome! Where we at with this?

@mcollina (Member)

We are reworking the worker thread data transport as there were a few cases of data loss. You can follow our progress on that at https://www.npmjs.com/package/thread-stream.

@mcollina (Member) commented Apr 6, 2021

@davidmarkclements I'm not sure we can safely implement this concept of transports as pure functions. We need streams.
I consider https://github.com/pinojs/pino-elasticsearch/blob/25b400ccc847ef1bbb2336cd8c05efa216d692dd/lib.js#L8 a good example of how to implement a transport; ideally that should work well with the next gen transports as well.
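
For context, a minimal sketch of a stream-shaped transport along those lines - split2 plus a Writable, as in the pino-elasticsearch example; the exact shape of the exported function is an assumption here, not a settled API:

// Illustrative sketch, not a settled transport API.
import split from 'split2'
import { Writable, pipeline } from 'stream'

export default function build () {
  // Parse the newline-delimited JSON that pino writes in.
  const parser = split((line) => {
    try {
      return JSON.parse(line)
    } catch (err) {
      return undefined // skip malformed lines rather than crashing the transport
    }
  })

  // Do something with each parsed log object - here we just print it.
  const sink = new Writable({
    objectMode: true,
    write (log, enc, cb) {
      console.log(log.level, log.msg)
      cb()
    }
  })

  pipeline(parser, sink, () => {})
  return parser // pino writes into the head of the pipeline
}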

@davidmarkclements (Member, Author)

@mcollina and I had a call about this.

  1. Pinole is deprecated; @mcollina reimplemented the concept as thread-stream
  2. the pure function debate is still on the table if we can implement it on top of the streams concept Matteo has put together - the benefit is line splitting and parsing in core, in the worker thread, so that it doesn't have to be reimplemented in every transport over and over again
  3. the legacy transport wrapping concept isn't worth the hassle - the ways we would do it would touch parts of Node core that aren't really meant to be touched. The delta for migrating most of our supported transports is small: we just have to export a stream from each transport and then it's compatible

I'm going to close this PR now, in favour of #1003 and #1004

@jsumners deleted the next-gen-transports branch on November 20, 2021
@github-actions (bot)

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions bot locked as resolved and limited the conversation to collaborators on Nov 21, 2022