syscall: add support for Windows job objects #17608
Comments
This is not a good way to start a bug report. You've now established yourself as an adversary rather than a collaborator. Something to keep in mind for future bug reports. Tone matters. |
I'm sorry, but we need more information to understand what you are asking for on the Go side. I did read nodist/issues#179, but there is not a line of Go code in sight. You also mentioned reopening #6720, which as I read it is about p.Signal(os.Interrupt), where p is an os.Process, not being implemented and therefore always returning an error on Windows. That issue was closed by 05cc78d8, which did:
As discussed in #6720, we don't know of an obvious way to implement p.Signal(os.Interrupt) on Windows, or we would have. But none of us are Windows experts. A few long-time users, maybe, but not experts. We do the best we can by reading the Microsoft documentation and a liberal amount of Stack Overflow and trial and error.
One possibility is that nodejs ctx.child.kill('SIGTERM') is sending a Ctrl-Break to the entire process group, but if that were the case I don't understand why that wouldn't hit both the go wrapper and the nodejs server it started.
Another possibility is that nodejs ctx.child.kill('SIGTERM') knows some kind of magic to send a Ctrl-Break to just that one process. If so and you can tell us what that magic is, we can probably implement it in Go.
Another possibility is that SIGTERM doesn't mean Ctrl-Break at all here. But then what does it mean? I looked in github.com/nodejs/node and there is no mention of what SIGTERM means on Windows. They must be using the Microsoft C runtime library, but what does that do? I looked in the mingw sources and the closest I found was mingw/include/signal.h, which says:
"SIGTERM comes from what kind of termination request exactly?". Exactly. It sounds like maybe you know what Go should be doing instead, or maybe what SIGTERM means on Windows. If so, can you tell us? Thanks. |
It looks like Node uses libuv, which treats SIGKILL, SIGTERM, and SIGINT the same on windows: https://github.com/nodejs/node/blob/db1087c9757c31a82c50a1eba368d8cba95b57d0/deps/uv/src/win/process.c#L1166 |
@pbnjay, thanks for finding that. To recap, there is a Node parent that ran a Go process that ran a Node child.
On Unix systems, if the Node parent sends SIGTERM ("please stop") to the Go process, then the Go process's signal handler runs and can do something in response to the signal, like send SIGTERM to the Node child, wait for the Node child to exit gracefully, and then exit itself.
On Windows, my reading of what @pbnjay found is that the Node parent calls TerminateProcess on the Go process. That doesn't send a nice "please stop" to the Go process. It just terminates it, like Unix SIGKILL. There is no signal sent, no time to react; the operating system just destroys the process. In this case the Node child is left behind. It would have to be: the Go process had no chance to do anything.
On Linux, there is still a way to cope with SIGKILL. When the Go program starts the subprocess, it can pass a SysProcAttr with Pdeathsig!=0, which makes the forked child call prctl(PR_SET_PDEATHSIG, Pdeathsig) before exec'ing the actual new program. That setting means "if my parent dies, send me this signal", so that even if the Go program dies with kill -9 or some other path that forgets to do cleanup, the child can be notified that the parent is gone and clean up after itself.
It looks like maybe the Windows equivalent of PR_SET_PDEATHSIG is "job objects". It is unclear to me whether this still works in current versions of Windows, but some way to support that would be the obvious next thing to try. I'm going to retitle this bug to be about that. |
I'm away from keyboard until Sunday. However, I wanted to say I appreciate the seriousness and professionalism this thread is getting, despite the aforementioned ...tone. I wish I had more lower-level info; I would have shared it at the start. I can elaborate a bit more, and I will next week. |
In case it affects your priorities, I owe you an update. Specifically for my case a workaround has been found: This results in removing the 3rd and 2nd last steps, jumping straight to the last. In this case, the SIGTERM terminates the server and the system works as expected without leaving hung processes. However, we still get hanging processes whenever I FYI. |
FYI, AssignProcessToJobObject fails on Windows 7. AFAIK, on Windows 7 or older the process has to be terminated by walking its child processes using CreateToolhelp32Snapshot. Another issue, as alex said in #6720, is that GenerateConsoleCtrlEvent has its own problem: the API requires a console, so if the process doesn't have a console, it doesn't work. For example, if the process calls AllocConsole, it works fine. |
I do not know about PR_SET_PDEATHSIG, but you can use "job objects" to control process groups on Windows. I even have github.com/alexbrainman/ps package with some APIs. We have used "job objects" to collect benchmark run statistics in golang.org/x/benchmarks/driver. From what I remember "job objects" provide facilities for child processes to start their own group too, so you would need some cooperation from your clients.
I think it works on all Go supported Windows versions. Alex |
/cc @johnsonj |
…eebsd When baur executes a task and the baur process gets killed, the task subprocess continues to run. This was reproduced on Linux; other OSes were not tested but are probably also affected. Prevent this by setting Pdeathsig for the executed process. If the parent thread is killed, the specified signal (SIGKILL) is sent to the child. Pdeathsig is sent when the parent thread dies, so runtime.LockOSThread is called to prevent the thread that ran the goroutine which started the process from dying[^1]. This fixes the issue only on Linux and FreeBSD. Windows and Darwin do not have Pdeathsig in their SysProcAttrs. To achieve the same on Windows, support for job objects in Golang might be needed[^2]. [^1]: golang/go#27505 (comment) [^2]: golang/go#17608
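A minimal sketch of the approach this commit message describes, assuming Linux: the child is started from a goroutine pinned to its OS thread, with Pdeathsig set to SIGKILL. The function name runWithPdeathsig is hypothetical and this is not baur's actual code.

```go
//go:build linux

package main

import (
	"os/exec"
	"runtime"
	"syscall"
)

// runWithPdeathsig (hypothetical helper) runs the named program and asks
// the kernel to send it SIGKILL if the thread that started it dies.
func runWithPdeathsig(name string, args ...string) error {
	errc := make(chan error, 1)

	go func() {
		// The death signal is tied to the OS thread that forked the child,
		// not to the whole process, so keep this goroutine on one thread
		// until the child has finished.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()

		cmd := exec.Command(name, args...)
		cmd.SysProcAttr = &syscall.SysProcAttr{
			Pdeathsig: syscall.SIGKILL, // delivered to the child when the parent thread exits
		}
		errc <- cmd.Run()
	}()

	return <-errc
}
```

The Pdeathsig field does not exist in syscall.SysProcAttr on Windows or Darwin, hence the build constraint; on Windows a job object would have to play this role instead.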
I would like to reopen #6720.
I excitedly read the whole thread, ending with the greatest facepalm I have had in the past years.
Look, this is not a solution!
If nodejs knows how to pass these signals between processes on Windows, then there is no reason that golang should not. No magic involved.
Full story below.
Mind that if the go shim is out of the loop, it works as expected, so I deduce it has to be something that golang does wrong.
I would appreciate your input on this...
What version of Go are you using (go version)?
donno. the one that [email protected] uses
What operating system and processor architecture are you using (go env)?
Windows 10, x64
What did you do?
I'm using nodist, which allows me to run different versions of nodejs side by side. nodist uses a shim layer of go to look at the env vars and the local folder to detect the desired node version and call the relevant executable accordingly.
See here: nodists/nodist#179
What did you expect to see?
nodists/nodist#179
What did you see instead?
nodists/nodist#179