Plugins Epic #1896
I really like it! One early 'internal' consumer for this could be session recording, but this would require:
However, now that I'm thinking about this more deeply, I think point 2 would cause issues - on cloud, we would need to do a lot of work to limit access to only your own organization's data. So instead of relying on plugins, let's roll it into the main repo and extract it as a plugin as it evolves. Thoughts? |
I believe scheduled tasks and access to models will be easy to add. I just wanted to get all the rest solid first. Regarding API access, you can literally do this inside a plugin right now:

```python
from posthog.models import Event

events = Event.objects.filter(team_id=config.team)
```

... and then do whatever you need. It's just not wise to do these queries inside the process_event block though, which is why we need scheduled tasks. This was basically already handled during the hackathon by just passing the celery object to the plugin, giving the plugin the opportunity to register tasks, but I removed that for now. Obviously plugin authors will need to be careful to scope their queries to a team, like we do now in the main codebase. This will be up to plugin authors to handle though... :/ And we won't run unknown plugins on app, so this shouldn't really be an issue. |
No custom plugins on cloud, right? If we're exposing our models then I think we should do another refactoring first as well: rename our root module in this repo to avoid the conflict with https://github.com/PostHog/posthog-python. It's more than conceivable that users would love access to both without needing to hack around it. |
I think this could be a great use case for a plugin and a nice example for others to follow when making their own retention style plugins. That said, feel free to start coding this inside app and we can extract later. |
I'm so excited by this, but I think we need to think about ensuring adoption.

Broadening the plugins' appeal

The range of what you can do is severely limited at the moment. Opening up all models would make plugins far more versatile.

Improving the development process

Whilst making a plugin is simple, for someone outside our core team who isn't already doing local development, I don't think it's trivial - they would need to deploy PostHog locally and do everything manually (~12 commands). The advantage of making this entire process trivial end to end is that we'll get more people in the community building plugins. This would be a strategic benefit, as it would help us achieve platform status. A few thoughts on improving this - although I am very open to alternative ideas, as I'm not really the target audience:

Security

Could we automatically filter all queries by team for any plugin, somehow? It feels like relying on people to add their own appropriate team filters is unrealistic. |
Here's another thing to consider. Plugins are currently exported as a class with the following signature:

```python
# exampleplugin/__init__.py
from posthog.plugins import PluginBaseClass, PosthogEvent, TeamPlugin

class ExamplePlugin(PluginBaseClass):
    def __init__(self, team_plugin_config: TeamPlugin):
        super().__init__(team_plugin_config)
        # other per-team init code

    def process_event(self, event: PosthogEvent):
        event.properties["hello"] = "world"
        return event

    def process_identify(self, event: PosthogEvent):
        pass
```

The classes for these plugins are loaded into Python when the app starts (or a reload is triggered). This means that in a large app with multiple teams, we can have thousands if not more copies of the same object loaded in memory. For example, if we load a 62MB IP database with every initialization of the maxmind plugin for each team, with a thousand teams we'll need 62GB of RAM. Thus it must be possible for plugins to share state per app instance. Here are two ideas to solve this.

Option 1. Functional shared-nothing style:

```python
# maxmindplugin/__init__.py
import geoip2.database

from posthog.plugins import PluginBaseClass, PosthogEvent, TeamPluginConfig
from typing import Dict, Any

def instance_init(global_config: Dict[str, Any]):
    geoip_path = global_config.get("geoip_path", None)
    reader = None
    if geoip_path:
        reader = geoip2.database.Reader(geoip_path)
    else:
        print("🔻 Running posthog-maxmind-plugin without the 'geoip_path' config variable")
        print("🔺 No GeoIP data will be ingested!")
    return {
        "config": global_config,
        "reader": reader
    }

# # Not used for this plugin
# def team_init(team_config: TeamPluginConfig, instance: Dict[str, Any]):
#     return {
#         "config": team_config.config,
#         "cache": team_config.cache,
#         "team": team_config.team,
#     }

def process_event(event: PosthogEvent, team_config: TeamPluginConfig, instance_config: Dict[str, Any]):
    if instance_config.get('reader', None) and event.ip:
        try:
            response = instance_config['reader'].city(event.ip)
            event.properties['$country_name'] = response.country.name
        except Exception:
            # ip not in the database
            pass
    return event

def process_identify(event: PosthogEvent, team_config: TeamPluginConfig, instance_config: Dict[str, Any]):
    pass
```

I'm not set on the naming of things... nor on the exact shape of the dicts/objects returned from each function, so please ignore that (and share feedback if you have it). The point is that this would be a "serverless" or "functional" shared-nothing style approach. We would call the instance_init or team_init functions as needed and pass the objects they return to each process_* method.

Option 2 - class globals:

```python
class MaxmindPlugin(PluginBaseClass):
    @staticmethod
    def init_instance(global_config: Dict[str, Any]):
        geoip_path = global_config.get("geoip_path", None)
        if geoip_path:
            MaxmindPlugin.reader = geoip2.database.Reader(geoip_path)
        else:
            print("🔻 Running posthog-maxmind-plugin without the 'geoip_path' config variable")
            print("🔺 No GeoIP data will be ingested!")
            MaxmindPlugin.reader = None

    def init_team(self, team_config):
        pass

    def process_event(self, event: PosthogEvent):
        if MaxmindPlugin.reader and event.ip:
            try:
                response = MaxmindPlugin.reader.city(event.ip)
                event.properties['$country_name'] = response.country.name
            except Exception:
                # ip not in the database
                pass
        return event
```

Here the same class would have two init methods, one of them static. In this scenario, we would still init a new class per team per plugin, but with a much smaller payload. Which option do you prefer? 1 or 2? |
I went with option 2 for now. Also, I made a small TODO list. Bigger features:
Dev Experience:
UX:
Safety:
Docs:
Sample plugins:
|
For those following along and experimenting with plugins on Heroku, I have run across a new and unexpected issue! The PUBSUB worker reload code creates too many connections to Redis, making the app unusable on Heroku with the free redis instance. Celery is consistently running into Redis connection limit errors.

Unrelated, the worker is also constantly running out of memory and starts using swap. The explanation is that celery forks a new worker for each CPU core it finds. In the $7/mo heroku hobby dynos, 8 CPUs are reported... thus taking up (1+8) * 70MB ≈ 630MB of RAM and an additional 1+8 celery connections for the plugin reload PUBSUB. On another branch preview, without the plugin reload pubsub, 12-19 redis connections are already used, so the extra 9 clearly exceed the limit. Bumping the redis addon to one with 40 connections, I see that 28 are used.

In addition to all of this, there seems to be some issue reloading plugins in the web dynos. I'll keep investigating, though it seems it might be smart to ditch the pubsub for plugin reloads and just use a regular polling mechanism... though I need to test this. Alternatively, it might be wiser to hoist the reload up from per-fork to per-worker, putting it basically into ./bin/start-worker and reloading the entire process once a reload takes place. |
Hello!

Gallery of failed attempts

Since I last posted, the following has happened:
Plugins via Node-Celery

Since we're already using celery, it just made a lot of sense to use the existing infrastructure and pipe all events through celery. It works beautifully! 🤩 To enable, set the relevant setting (you might also need to run an extra command or two). In case the plugins server is down, events will just queue up and hopefully nothing is lost. Plugin reloads are done via a redis pubsub system, triggered by the app.

Plugin format

To install a plugin, all you need is a github repo with an index.js and a plugin.json:

```js
// plugin.json
{
    "name": "helloworldplugin",
    "url": "https://github.com/PostHog/helloworldplugin",
    "description": "Greet the World and Foo a Bar, JS edition!",
    "main": "index.js",
    "lib": "lib.js",
    "config": {
        "bar": {
            "name": "What's in the bar?",
            "type": "string",
            "default": "baz",
            "required": false
        }
    }
}
```

The main file contains the plugin's entry points, and the lib file can contain extra helper code, for example:

```js
// lib.js
function lib_function (number) {
    return number * 2;
}
```

This function is now available in the plugin's index.js. Here's what you can do in index.js:

```js
// index.js
async function setupTeam({ config }) {
    console.log("Setting up the team!")
    console.log(config)
}

async function processEvent(event, { config }) {
    const counter = await cache.get('counter', 0)
    cache.set('counter', counter + 1)

    if (event.properties) {
        event.properties['hello'] = 'world'
        event.properties['bar'] = config.bar
        event.properties['$counter'] = counter
        event.properties['lib_number'] = lib_function(3)
    }

    return event
}
```

The setupTeam and processEvent functions above are the entry points the plugin server calls. Inside these JS files you can run the following:
There's still a lot of work to do to clean this up even further, though what is in there now already works.

Next steps

Here are some todo items:
Even further steps:
|
Noting down another plugin idea: tracking how many times a library has been installed. This should again help make product decisions (e.g. which to add autocapture to: flutter vs react-native vs ios). |
New stuff! On all self-hosted installations (no feature flag needed & multi tenancy excluded), when you load plugins from "project -> plugins", you're greeted with this page: It has two features:
Once enabled per team, incoming events are sent to a new celery task. This task will be picked up by the node worker via celery. After running the event through all relevant plugins for the team, it sends a new task back to the Python app to ingest the event.

There's also a much much much nicer interface to install and configure the plugins (thank you @paolodamico !!):

There are a few rough edges (no upgrades, only string fields), but as a first beta it gets the job done. If there's an error in any plugin, either during initialisation or when processing an event, you can also see the error together with the event that broke it:

And when you decide you have had enough, just disable the plugin system and all events pass through celery as normal: |
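To make the flow concrete, here's a rough sketch of what the node worker conceptually does with each event it receives over celery, based on the description above. This is not the actual plugin-server code; the plugin registry and the ingestion hand-off below are hypothetical stand-ins:

```js
// Conceptual sketch only - not the actual worker code.
// Pretend registry: which plugins are enabled for which team.
const pluginsByTeam = {
    2: [
        { processEvent: async (event) => ({ ...event, properties: { ...event.properties, hello: 'world' } }) },
    ],
}

// Stand-in for "send a new celery task back to the Python app for ingestion".
async function sendBackForIngestion(event) {
    console.log('would enqueue ingestion task for', event.event)
}

async function handleEventFromCelery(event, teamId) {
    let processed = event
    // Run the event through every plugin enabled for this team, in order.
    for (const plugin of pluginsByTeam[teamId] || []) {
        if (!processed) break // a plugin may drop the event entirely
        processed = await plugin.processEvent(processed)
    }
    if (processed) {
        await sendBackForIngestion(processed)
    }
}

// Example run:
handleEventFromCelery({ event: 'pageview', properties: {} }, 2)
```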
Jotting down some recommendations for the next iteration. The error thing is pretty cool, some suggestions to improve this:
|
Hey @paolodamico , totally agree with the suggestions and we should make this much better. For now, there's at least something. |
Master plan with plugins:
I'm sure I forgot some things, but this is basically what we're looking at. This is turning out to be a long hackathon 😅 |
Tasks regarding plugins are now tracked in this project |
A few thoughts on stuff that would help these launch successfully:
Depending on your reaction to the above, perhaps we should clarify on the project what is a blocker to launching? |
Over the last few days plugins have gotten decidedly more exciting. When PR #2743 lands (and PostHog/plugin-server#67), we will support:
Both features have their gotchas and are decidedly beta, yet, excitingly, they work well enough for a lot of use cases. Check it out while it lasts. The Heroku Review App for this branch contains a few fun plugins.

1. The "github metric sync" plugin

Not yet the full stargazers sync, but just syncing the number of stars/issues/forks/watchers as a property every minute.

Code:

```js
async function runEveryMinute({ config }) {
    const url = `https://api.github.com/repos/PostHog/posthog`
    const response = await fetch(url)
    const metrics = await response.json()

    posthog.capture('github metrics', {
        stars: metrics.stargazers_count,
        open_issues: metrics.open_issues_count,
        forks: metrics.forks_count,
        subscribers: metrics.subscribers_count
    })
}
```

All events captured in a plugin via posthog.capture end up in PostHog like any other event, so we can graph this. Our star count is steady!

2. The "Queue Latency Plugin"

This is a pretty quirky use case.

```js
// scheduled task that is called once per minute
function runEveryMinute() {
    posthog.capture('latency test', {
        emit_time: new Date().getTime()
    })
}

// run on every incoming event
function processEvent(event) {
    if (event.event === 'latency test') {
        event.properties.latency_ms = new Date().getTime() - event.properties.emit_time
    }
    return event
}
```

Since the event is emitted by a scheduled task and then travels through the same queue as every other event, the latency_ms property tells us how long events spend waiting in the queue. Using PostHog to measure PostHog. 🤯

Github star sync plugin

I started making a true github star sync plugin, but still have two blockers that need to be solved separately.
Even with these blockers, the plugin is already possible to build.

Snowflake/BigQuery plugin

Segment, in their functions, exposes a bunch of node packages to the user. With something similar we could build the Snowflake and BigQuery export plugins.

Other things to improve

There are so many things that can be improved. Browse the Heroku app and write down the first 5 you find. Here are some random ones:
This is BETA

Plugins, while legitimately powerful, are still legitimately beta. The next step is to get this running on cloud and get the snowflake and bigquery plugins out. |
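For a taste of what such an export plugin could look like with just the pieces shown above (processEvent, config, and fetch), here's a minimal sketch. The exportUrl config field and the endpoint are hypothetical placeholders, not a real Snowflake/BigQuery integration:

```js
// Minimal "export" style plugin sketch - posts each event to an external endpoint.
// config.exportUrl is a made-up config field, used only for illustration.
async function processEvent(event, { config }) {
    try {
        await fetch(config.exportUrl, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(event),
        })
    } catch (error) {
        // don't block ingestion if the destination is down
        console.log('export failed', error)
    }
    return event
}
```

A real warehouse plugin would batch events and retry failures instead of posting them one at a time, but the shape of the code stays the same.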
Here it is 🥁 🥁 🥁 the github star sync plugin:

```js
async function runEveryMinute({ cache }) {
    // if github gave us a rate limit error, wait a few minutes
    const rateLimitWait = await cache.get('rateLimitWait', false)
    if (rateLimitWait) {
        return
    }

    const perPage = 100
    const page = await cache.get('page', 1)

    // I had to specify the URL like this, since I couldn't read the headers of the original request to get
    // the "next" link, in which `posthog/posthog` is replaced with a numeric `id`.
    const url = `https://api.github.com/repositories/235901813/stargazers?page=${page}&per_page=${perPage}`
    const response = await fetch(url, {
        headers: { 'Accept': 'application/vnd.github.v3.star+json' }
    })
    const results = await response.json()

    if (results?.message?.includes("rate limit")) {
        await cache.set('rateLimitWait', true, 600) // timeout for 10min
        return
    }

    const lastCapturedTime = await cache.get('lastCapturedTime', null)
    const dateValue = (dateString) => new Date(dateString).valueOf()

    const validResults = lastCapturedTime
        ? results.filter(r => dateValue(r.starred_at) > dateValue(lastCapturedTime))
        : results

    const sortedResults = validResults.map(r => r.starred_at).sort()
    const newLastCaptureTime = sortedResults[sortedResults.length - 1]

    for (const star of validResults) {
        posthog.capture('github star!', {
            starred_at: star.starred_at,
            ...star.user,
        })
    }

    if (newLastCaptureTime) {
        await cache.set('lastCapturedTime', newLastCaptureTime)
    }

    if (results.length === perPage) {
        await cache.set('page', page + 1)
    }
}
```

I would like an option to specify a custom timestamp for my event. Other than that, it works! What's more, it makes only 60 requests per hour, keeping below Github's free usage API rate limits :). |
Pretty exciting updates @mariusandra, thanks for sharing it in such detail! Would like to start writing out a plugin really soon. In the meantime let me know if I can help with the UI/UX to better communicate the new functionality/workflow. |
Cool, looks nice, which external node modules are supported? I assume you need to preinstall and/or white list them? |
There are two ways to include external modules.
Right now only a small set is exposed. For reference, segment does something similar as well. |
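As a concrete (if trivial) illustration of using an exposed module, here's a plugin function calling an external API with the fetch seen in earlier examples; the httpbin URL is just a stand-in for any third-party service:

```js
// Enrich every event with a value fetched from an external API.
// The URL is a placeholder used purely for demonstration.
async function processEvent(event, { config }) {
    const response = await fetch('https://httpbin.org/uuid')
    const data = await response.json()

    if (event.properties) {
        event.properties['external_uuid'] = data.uuid
    }
    return event
}
```

Making a network request for every single event adds latency to ingestion, so in practice you would cache the response (with the cache shown in earlier examples) or move the call into a scheduled task.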
Memory benchmarks! As it is built now, each enabled plugin for each team runs in its own VM. So how heavy is a VM? (Un?)Surprisingly, not at all! A simple plugin VM takes about 200KB of memory. A more complicated plugin (100kb of posthog-maxmind-plugin/dist/index.js) takes about 250KB. Thus running 1000 VMs in parallel consumes an extra 250MB of RAM. Said differently, if 1000 customers on cloud enable one plugin, the server's memory footprint will grow by 250MB per worker. Obviously a VM that loads a 70MB database and keeps it in memory throughout its lifetime will consume more memory, but for all intents and purposes VMs are very light.

Originally I had imagined a "shared plugin" system for "multi tenancy" (cloud), where we spin up a bunch of shared VMs that can just be enabled/disabled per team. However I could never get over the danger of leaking data, for example when one team's plugin leaves state behind that another team's events could see. Now I'm thinking differently. With such a light footprint, we can spin up a new VM for each team that wants to use a plugin, thus completely separating the data in memory. If the number gets to be more than one CPU core can handle (so over 10k plugins in use?), we can split the work and scale horizontally as needed.

For enterprise customers using PostHog Cloud, we could provide additional worker-level or process-level isolation. This is what cloudflare does - they split the free workers and the paid clients' workers into separate clusters. In our case, with thread-level isolation on cloud, each paying customer could get their own worker (aka CPU thread) that runs all their plugins. These workers could be automatically spun up and down by the plugin server as the load changes, protecting paying customers from broken and runaway plugins made by other customers. With something like this, we could even enable the plugin editor for all paying customers. We're really making a lambda here :). |
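For intuition on how per-team VMs isolate state, here's a toy sketch using Node's built-in vm module. This is only an illustration of the idea, not how the plugin server actually creates its VMs:

```js
const vm = require('vm')

// Pretend "plugin" source: mutates a global counter and reports it.
const pluginSource = `
    globalThis.counter = (globalThis.counter || 0) + 1
    result = { team: teamId, counter: globalThis.counter }
`
const script = new vm.Script(pluginSource)

// One long-lived context (≈ "VM") per team: state persists for that team
// but is invisible to every other team.
const contexts = new Map()
function runForTeam(teamId) {
    if (!contexts.has(teamId)) {
        contexts.set(teamId, vm.createContext({ teamId, result: null }))
    }
    const context = contexts.get(teamId)
    script.runInContext(context)
    return context.result
}

console.log(runForTeam(1)) // { team: 1, counter: 1 }
console.log(runForTeam(1)) // { team: 1, counter: 2 } - team 1's state persists
console.log(runForTeam(2)) // { team: 2, counter: 1 } - but team 2 sees none of it
```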
It's been 1.5+ months (including the Christmas break) since the last update, so time for a refresher! The big big change that has happened since then is that event ingestion is now handled by the plugin server! This is still beta and disabled by default, but when enabled, events, after being processed by the plugins, are ingested (stored in postgres or clickhouse) directly inside the plugin server. For Postgres/Celery (OSS) installations, this avoids one extra step. For ClickHouse/Kafka (EE) installations, this makes using plugins possible at all, as with that setup we otherwise have nowhere to send the event after the plugins have finished their work.

The work in the next weeks will be to stabilise this ingestion pipeline and enable it for all customers on PostHog Cloud. Currently we're bottlenecked at ~100 events/sec per server instance (even less with long-waiting plugins) and this needs to be bumped significantly. Only after that can we enable plugin support for all cloud users. Hopefully next week :).

Other notable changes in the last month or so:
All that said, with the launch of plugins on cloud (already enabled for some teams to test), we're entering a new era for the plugin server. From now on we must be really careful not to trip anything up with any change and religiously test all new code! We also introduced quite a bit of technical debt with the 4000 changed lines of the ingestion PR (all the magic SQL functions, database & type mapping, etc). This needs to be cleaned up eventually. While we've gotten very far already, there are many exciting changes and challenges still to come. For example:
And then we'll get to the big stuff:
|
I think this has evolved in a bunch of different places and can now be closed? @mariusandra |
@paolodamico I think this can indeed be closed, but not before one last update!

One last Plugin Server update

It's been 3.5 months since the last update. Let's check in on our contestants. What we have been building with plugins is something unique... something that in its importance and its value to the bottom line has a legit opportunity to overtake all other parts of PostHog (though it won't be our defining feature since it's already built). The Plugin Server has turned PostHog into a self-hosted and seriously scalable IFTTT / Zapier / Lambda hybrid, with RDS, ElastiCache, SQS and other higher-level abstractions baked right in. It has become a serious application platform on its own. (Seriously, it has. Check out this 45min talk, LTAPSI - Let's Talk About Plugin Server Internals, for more.) Combine this with a scalable event pipeline, and you can build some really cool shit. Web and product analytics? So 2020. Here are some more exotic ideas:
Oh and PostHog can still do web and app analytics, session recording, heatmaps, feature flags, data lakes, etc, etc, ad nauseam :).

Current state

Plugins now power the entire ingestion pipeline. On PostHog cloud, one plugin server can ingest at most a thousand events per second. Plugins are now used by many self-hosted and cloud customers to augment their data and to export it to various data lakes. We have had several high quality community plugins come in, such as sendgrid and salesforce (should they be added to the repo?). We've had enterprise customers write their own 700-line plugins to streamline data ingestion. You just need to write the following to have an automatic, batched and retry-supported data export plugin:

```js
import { RetryError } from '@posthog/plugin-scaffold'

export async function exportEvents (events, { global, config }) {
    try {
        await fetch(`https://${config.host}/e`, {
            method: 'POST',
            body: JSON.stringify(events),
            headers: { 'Content-Type': 'application/json' },
        })
    } catch (error) {
        throw new RetryError() // ask to retry
    }
}
```

If you throw the error, we will try running this function again (with exponential backoff) for around 48h before giving up.

We now have a bunch of functions you can write. You can also export jobs that run asynchronously:

```js
export const jobs = {
    performMiracleWithEvent (event, meta) {
        console.log('running async! possibly on a different server, YOLO')
    }
}

export function processEvent (event, { jobs }) {
    jobs.performMiracleWithEvent(event).runIn(3, 'minutes')
    event.properties.hello = "world"
    return event
}
```

Here's real feedback from a customer that we received (name omitted just in case):
Since the last update 3.5 months ago we have built a lot: loop timeouts injected via babel, polished CD, and more. We're only getting started :).

Next challenges

Look at the team extensibility project board for what we're working on now. There's a lot of ongoing "keep the lights on" work, which will continue to take up most of the time going forward. This work is not exciting enough to mention here (90% of the closed PRs over the last 4 months, for example), but it is absolutely important to get through. Of the big things, there are a few directions we should tackle in parallel:
Only when that's done could we also look at UI plugins. Let's hold back here for now, as the frontend is changing so rapidly. Instead let's take an Apple-ish approach where we only expose certain elements that are ready, starting with buttons to trigger jobs and displaying the output in the console logs.

Big thing to watch out for

I believe the biggest challenge for the plugin server will come in the form of flow control. The job queue next steps issue briefly talks about it. The plugin server has just a limited number of workers (40 parallel tasks on cloud). Imagine Kafka sending us a huge batch of events while we're at the same time receiving a lot of background jobs and running a few long-running scheduled tasks - the limited pool of workers would quickly be overwhelmed. To prevent this, there's a "pause/drain" system in place for most queues. We periodically check if piscina is busy, and if so, stop all incoming events/jobs/etc. If we add more features and are not careful regarding flow control, we can run into all sorts of bottlenecks, deadlocks, and lost data. We must be terrified of issues with flow control if we're to build a project for the ages.

Related: the redlocked services (schedule, job queue consumer) are now bound to running on just one server. This will not scale either. There must be an intelligent way to share load and service maps between plugin server instances... without re-implementing zookeeper in TypeScript.

Last words

I'll close this issue now, as work on the plugin server is too varied to continue keeping track of in just one place. I sincerely believe that what we have built with the PostHog Plugin Server is something unique, with limitless use cases, for personal and business needs alike. It's especially unique given it's an open source project. Somehow it feels like giving everyone a new car for free. I'm super excited to see what road trips the community will take with it :). |
In order to not pollute the PR with discussion that will be hidden by a thousand commits, I'll describe here what is currently implemented and where we go from here.
Plugins in PostHog
One of the coolest ideas that came from the PostHog Hackathon was the idea of Plugins: small pieces of code that can be installed inside posthog, providing additional or custom features not found in the main repo.
Two examples of plugins that are already built:
Currently plugins can only modify events as they pass through posthog. Support for scheduled tasks, API access, etc is coming. More on this later.
Installing plugins via the interface
Assuming the following settings are set:
... the following page will show up:
Plugins are installed per-installation and configured per-team. There is currently no fine-grained access control: either every user on every team can install and configure plugins, or nobody can.
When installing plugins or saving the configuration, plugins are automatically reloaded in every copy of the app that's currently running. This is orchestrated with a redis pubsub listener.
Installing plugins via the CLI
Alternatively, you may set the INSTALL_PLUGINS_FROM_WEB setting to False and use the posthog-cli to install plugins. Plugins can be installed from a git repository or from a local folder:
Plugins installed via the CLI will be loaded if you restart your posthog instance. They will then be saved in the database just like the plugins installed via the web interface. Removing the plugins from posthog.json uninstalls them the next time the server is restarted.

In case you use both web and CLI plugins, the settings in posthog.json will take precedence and it will not be possible to uninstall these plugins in the web interface.
As it stands now, it's not possible to configure installed plugins via the CLI. The configuration is still done per team in the web interface.
Creating plugins
It's pretty simple. Just fork helloworldplugin or use the CLI:
Todo for this iteration

- Release the PluginBaseClass as a new posthog-plugins pip package

Future ideas
Feedback
All feedback for the present and the future of plugin support in posthog is extremely welcome!