Health check question #2197
Comments
@mhaamann Throw an exception? |
Hmm, makes sense. |
I'd say that is really up to you. An uncaught exception will terminate your app, so if you're using express (or similar), I'd suggest including a "graceful shutdown" middleware in your server setup that waits for sockets to be cleaned up before terminating. There are many implementations out there. |
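For illustration, the graceful shutdown idea can be sketched with plain Node.js and no middleware. This is only a sketch under assumptions: `gracefulShutdown`, its `timeoutMs` and `exit` options, and the 5 second default are hypothetical names and values chosen here, not part of PM2 or express.

```javascript
const http = require('http');

// Hypothetical helper: stop accepting new connections, let in-flight
// requests finish, then exit. A timer forces the exit if cleanup hangs.
function gracefulShutdown(server, { timeoutMs = 5000, exit = process.exit } = {}) {
  const timer = setTimeout(() => exit(1), timeoutMs);
  if (typeof timer.unref === 'function') {
    timer.unref(); // do not keep the event loop alive just for this timer
  }
  server.close(() => {
    clearTimeout(timer);
    exit(0);
  });
}

const server = http.createServer((req, res) => res.end('Done'));

// PM2 sends SIGINT on stop/restart before it kills the process.
process.on('SIGINT', () => gracefulShutdown(server));
```

In a real app you would also call `server.listen(3000)`. The same helper could be called from an `uncaughtException` handler, with a non-zero exit code, if you want the crash path to drain sockets as well.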
I think there is a use case for having a real health check instead of throwing an exception. For example, we have an application with an HTTP endpoint on /health which returns a 200 status code if everything is OK (db connection works, nosql store is reachable, etc). Throwing an exception is, in my opinion, not an acceptable solution if it results in the application being killed by PM2. In that case we could just as well call process.exit() when something goes wrong. And who says the application would not recover by itself, for example thanks to database connection pooling? |
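A /health endpoint of the kind described above might look roughly like this. It is a minimal sketch: the `checks` object and the `runChecks` helper are hypothetical, the check bodies are placeholders for your own db and nosql probes, and returning 503 on failure is one common choice rather than anything PM2 requires.

```javascript
const http = require('http');

// Hypothetical dependency checks: each resolves on success, rejects on failure.
const checks = {
  db: async () => { /* e.g. run a trivial query */ },
  nosql: async () => { /* e.g. ping the store */ },
};

// Runs every check and collects a per-dependency result.
async function runChecks(checks) {
  const results = {};
  let ok = true;
  for (const [name, check] of Object.entries(checks)) {
    try {
      await check();
      results[name] = 'ok';
    } catch (err) {
      ok = false;
      results[name] = String((err && err.message) || err);
    }
  }
  return { ok, results };
}

const server = http.createServer(async (req, res) => {
  if (req.url !== '/health') {
    res.statusCode = 404;
    return res.end();
  }
  const { ok, results } = await runChecks(checks);
  // 200 when everything is reachable, 503 otherwise
  res.writeHead(ok ? 200 : 503, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify(results));
});
```

Call `server.listen(...)` on the port your load balancer probes. Note that the process stays alive on a failed check, so it can recover on its own once the dependency comes back.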
+1 for health checking processes and better granularity of control at a process level. I have a pool of servers behind a load balancer which uses a health check (/healthcheck.js returns 200 OK) to take nodes out of the pool of active servers. Each server is running PM2, which clusters node instances to scale to the available CPU resources. I am finding that it is possible for one node process to lock up (no logs or errors reported) while the server stays in the load balancer pool because the other processes are still working. When this happens, the requests routed to that process are effectively offline with no automatic recovery. |
I'm not sure I understand what you want to implement, because I think you can already do that. Imagine an app like this:

    var http = require('http');

    http.createServer(function (req, res) {
      res.end('Done');
    }).listen(3000);

    function healthcheck(cb) {
      // do your verification
      return cb(null, true);
    }

    // packets sent with pm2.sendDataToProcessId arrive through the 'message' event
    process.on('message', function (packet) {
      if (!packet || packet.type !== 'healthcheck') return;
      healthcheck(function (err, data) {
        var state = !err;
        process.send({
          type : 'process:msg:healthcheck',
          data : {
            err : err,
            data : data,
            state : state
          }
        });
      });
    });

And a worker (I advise you to do that with a pm2 module, btw):

    var pmx = require('pmx'); // only needed if you package this worker as a pm2 module
    var pm2 = require('pm2');

    pm2.connect(function () {
      setInterval(function () {
        // every 10 seconds, list the processes handled by pm2
        pm2.list(function (err, list) {
          if (err) return console.error(err);
          list.forEach(function (proc) {
            // and send a healthcheck request to each of them
            pm2.sendDataToProcessId({
              type : 'healthcheck',
              data : {},
              topic : true, // newer PM2 releases expect a topic field
              id : proc.pm2_env.pm_id
            }, function (err, res) {
              // the response itself arrives through the pm2 EventBus;
              // err is set if something went wrong while talking to the pm2 daemon
            });
          });
        });
      }, 10000);

      pm2.launchBus(function (err, pm2_bus) {
        // listen for healthcheck responses here
        pm2_bus.on('process:msg:healthcheck', function (packet) {
          // analyse your data here and do what you want, like restart or whatever
          console.log(packet);
        });
      });
    });

I haven't tested this code, but it should work since it relies on the pm2 API, especially this part. |
And what can be done if the process fails the health check or is frozen and does not return a response? |
As you can read in the API here, you can restart/stop process etc |
OK, thanks for the information. I can see the module and API being very useful. In terms of health checking frozen processes, I would suggest this should probably be an official module if it isn't already. I imagine lots of people would use this. |
@JamesBewley not sure about this, because it depends a lot on your particular needs. |
@soyuka There will be a whole load of common logic specific to pm2 that has nothing to do with the application, such as how to detect and handle a timeout of the event across the axon bus. There might be some configuration, such as how long to wait before turning over the process, but the rest will likely be common. |
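The timeout handling mentioned above could be kept independent of the application, roughly as follows. This is a sketch only: `HealthTracker` and its methods are hypothetical names, not part of PM2. The worker would call `requestSent` before each healthcheck request, `responseReceived` when a reply shows up on the bus, and `sweep` on an interval.

```javascript
// Hypothetical helper, not part of PM2: tracks outstanding health checks and
// reports processes that did not answer within `timeoutMs`.
class HealthTracker {
  constructor(timeoutMs, onTimeout, now = Date.now) {
    this.timeoutMs = timeoutMs;
    this.onTimeout = onTimeout;
    this.now = now;           // injectable clock, mainly for testing
    this.pending = new Map(); // pm_id -> time the oldest unanswered request was sent
  }

  // Call right before sending a healthcheck request to a process.
  requestSent(pmId) {
    if (!this.pending.has(pmId)) this.pending.set(pmId, this.now());
  }

  // Call when a healthcheck response for that process arrives on the bus.
  responseReceived(pmId) {
    this.pending.delete(pmId);
  }

  // Call periodically; fires onTimeout once for every overdue process.
  sweep() {
    const t = this.now();
    for (const [pmId, sentAt] of this.pending) {
      if (t - sentAt >= this.timeoutMs) {
        this.pending.delete(pmId);
        this.onTimeout(pmId);
      }
    }
  }
}
```

The `onTimeout` callback is where the "how long to wait before turning over the process" policy would live, for example by calling `pm2.restart(pmId, cb)` from the worker.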
I've started looking at this and put together a PM2 module from the information here. I think this needs to be provided by PM2 |
PM2 isn't supposed to crash. If an app doesn't respond, no event will be emitted to the pm2 bus. And as I said, it's not a priority since it's possible to do this using the API. |
@vmarchaud Hello, is there any health check option for pm2, or do I have to use middleware in my app? |
Yes, but this is not automatic, right? |
Nope, you may consider using Keymetrics which is a great monitoring software which will work seamlessly with PM2! If you already have nagios or similar there should be some plugins helping you to do so. |
Hi,
Is it possible to restart a process managed by pm2 if it fails a health check?
Can I connect directly to a process managed by pm2? At the moment I can only access the process through pm2, which means it is random which process I will actually hit. This is mainly for debugging purposes.
The reason for asking is that today we found two processes (out of 5) that were failing the health check haproxy was using. This led to the server running pm2 being taken down every 5 minutes.
Thanks,