[Loom] Interrupt behaviour differs from the system implementation #158
Comments
... from VirtualThreads, to be extra-compatible with the standard implementation. Add test cases to demonstrate the feature. #158
Thanks for reporting, @cenodis! Please verify against the current snapshot (changes referenced above). If there are still discrepancies between junixsocket and vanilla Java, please provide some code or a unit test to demonstrate them; this will be a tremendous help.
The current implementation still throws unusual exceptions and keeps the socket open under some conditions. I have taken a stab at writing some unit tests for interrupt behaviour. The tests and the results on my Linux machine can be found here: InterruptTest.java. They are not very pretty but should cover the interrupt behaviour for all blocking methods (accept, connect, read, write) on the standard Unix socket. Assuming the test code is free of bugs, here are some observations I have made based on the results:
The tests currently only cover the basic Unix domain socket. I unfortunately lack the means to test more OS-specific socket types. But it should be fairly straightforward to extend the test with those if desired.
Make the behavior of interrupted junixsocket sockets closer to the vanilla Java implementations. #158
@cenodis Thanks for providing these additional details! Can you please try the latest code (either from main or SNAPSHOT builds, 2.10.0-20240628.170903-8 or newer) and report back? Cheers! PS: I'm happy to add the test class you referenced above, if you're willing to contribute the code under the Apache 2.0 license to this project.
I have run the exact same tests with the newest snapshot version ( Test [3] seems to fail or succeed somewhat randomly while [5] and [6] fail consistently on my machine. For completeness I have included a report from one of the runs where [3] has also failed. The tests are run on an Ubuntu machine with Linux 6.3.6 if that helps.
Not completely sure what you are trying to tell me with that. Why is the inheritance chain of
Sure. The code is so basic I have my doubts whether it's even licensable. But if it makes you more comfortable I give my blessing to use this code with whatever license this project uses.
I have also taken a look at the changes you have committed so far and there is something that stands out to me. You have a few blocks that have this shape:

    begin();
    try {
        // blocking op
    } catch (/*exceptions*/) {
        if (Thread.currentThread().isInterrupted()) {
            throw new ClosedByInterruptException();
        }
        // more exception handling
    } finally {
        end(complete);
    }

This interrupt check strikes me as redundant. If an interrupt occurs after (or during)
And another thing:

    try {
        end(complete);
    } catch (ClosedByInterruptException e) {
        throw closeAndThrow(e);
    }

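The `begin()`/`end()` pattern discussed in this comment comes from `java.nio.channels.spi.AbstractInterruptibleChannel`. The following is a minimal sketch of that contract, not junixsocket's code: `DemoChannel`, `blockingOp` and the latches are hypothetical names, and the "blocking operation" simply waits until the channel is closed. It suggests that `end(false)` by itself raises `ClosedByInterruptException` once the thread has been interrupted between `begin()` and `end()`.

```java
import java.io.IOException;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.spi.AbstractInterruptibleChannel;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

final class BeginEndSketch {
    /** Hypothetical channel whose only "I/O" is waiting until it gets closed. */
    static final class DemoChannel extends AbstractInterruptibleChannel {
        final CountDownLatch entered = new CountDownLatch(1);
        private final CountDownLatch closed = new CountDownLatch(1);

        @Override
        protected void implCloseChannel() {
            closed.countDown(); // closing the channel wakes up the blocked operation
        }

        void blockingOp() throws IOException {
            boolean complete = false; // the operation never completes in this sketch
            begin();
            try {
                entered.countDown();
                boolean interrupted = false;
                while (true) { // wait for the close, remembering any interrupt
                    try {
                        closed.await();
                        break;
                    } catch (InterruptedException e) {
                        interrupted = true;
                    }
                }
                if (interrupted) {
                    Thread.currentThread().interrupt(); // restore the interrupt status
                }
            } finally {
                end(complete); // no explicit isInterrupted() check in this sketch
            }
        }
    }

    static String run() throws InterruptedException {
        DemoChannel channel = new DemoChannel();
        AtomicReference<String> outcome = new AtomicReference<>("no exception");
        Thread worker = new Thread(() -> {
            try {
                channel.blockingOp();
            } catch (ClosedByInterruptException e) {
                outcome.set("ClosedByInterruptException, interrupted="
                        + Thread.currentThread().isInterrupted());
            } catch (IOException e) {
                outcome.set(e.getClass().getSimpleName());
            }
        });
        worker.start();
        channel.entered.await(); // interrupt only after begin() has been called
        worker.interrupt();
        worker.join();
        return outcome.get() + ", open=" + channel.isOpen();
    }

    static String demo() {
        try {
            return run();
        } catch (InterruptedException e) {
            return "demo interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

On a standard JDK this is expected to report a `ClosedByInterruptException` with the interrupt status set and the channel closed. Whether an extra `isInterrupted()` check is still needed for exceptions raised by the native operation itself is a separate question that this sketch does not answer.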
Make the begin/end/handle-interrupt logic for Channels a little bit more concise. #158
... and add license header/Javadoc. #158
Make test compile/run on Java 8 and newer. Change assumptions around exceptions and interrupt state: - Change expectation of ClosedByInterruptException to AsynchronousCloseException (its superclass). - Only expect Thread#interrupted on actual ClosedByInterruptException. - Use temporary AFUNIXSocketAddress instead of hardcoded one to prevent flaky results. #158
When we already get a ClosedChannelException, do not let AbstractInterruptibleChannel#end rethrow another one. #158
The parametrized method variants are not useful as-is for the selftest. Let's simplify the naming a little bit, and reformat the source code so we can find the corresponding test variant more easily. #158
Please try again with the latest snapshot (2.10.0-20240630.191437-9 corresponding to commit 7025a04). I've reworked both the exception handling code as well as your unit test, which is now also included in the selftest. Thanks again, for allowing me to include the test, @cenodis!
Previously, the test code expected ClosedByInterruptException, whereas now any subclass of ClosedChannelException is valid. However, if ClosedByInterruptException is thrown, the test will check if the Thread#interrupted flag is set. The Java API specs around SocketChannel/ServerSocketChannel/DatagramChannel permit any kind of IOException to be thrown, not just ClosedByInterruptException and AsynchronousCloseException, particularly ClosedChannelException, which is the common parent class. That one should be thrown if the channel is already closed upon calling, for example. I think the main change, compared to older versions, is that now we properly throw ClosedChannelException (an IOException subclass) instead of a junixsocket-specific SocketClosedException (a SocketException subclass), and that now in all cases the socket should be properly closed. If there are still discrepancies in what exceptions are thrown compared to the JVM implementations, I would argue that these are an implementation detail that should be ignored. Better to follow the specs as to what may happen than what currently does...
Unfortunately, "closed" may mean multiple things, so we need to make sure that all resources are properly closed by calling our own close method when required. This currently may or may not be required, but it doesn't hurt since this code is only called when an exception is thrown. I hope that helps.
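The exception hierarchy referred to in this comment can be made concrete with a small sketch. It uses only JDK classes; `classify` and its labels are hypothetical names chosen for illustration, and `java.net.SocketException` stands in for the junixsocket-specific `SocketClosedException`, which the comment describes as a `SocketException` subclass.

```java
import java.io.IOException;
import java.net.SocketException;
import java.nio.channels.AsynchronousCloseException;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;

final class ExceptionHierarchySketch {
    /** Maps an exception to a label, testing the most specific type first. */
    static String classify(IOException e) {
        if (e instanceof ClosedByInterruptException) {
            return "interrupted";            // extends AsynchronousCloseException
        }
        if (e instanceof AsynchronousCloseException) {
            return "closed-asynchronously";  // extends ClosedChannelException
        }
        if (e instanceof ClosedChannelException) {
            return "closed";                 // extends IOException
        }
        return "other-io-error";             // e.g. the SocketException branch
    }

    public static void main(String[] args) {
        System.out.println(classify(new ClosedByInterruptException()));
        System.out.println(classify(new AsynchronousCloseException()));
        System.out.println(classify(new ClosedChannelException()));
        System.out.println(classify(new SocketException("socket closed")));
    }
}
```

A caller that only catches `ClosedChannelException` therefore also handles the two more specific types, but never sees an exception from the `SocketException` branch, which is one reason the switch described above matters to calling code.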
Yes, and in my opinion this is the behaviour required by the specification.

    try {
        channel.blockingOp();
    } catch (ClosedByInterruptException e) {
        // ...
    } catch (ClosedChannelException e) {
        // ...
    }

To quote
To me this reads that if the blocking operation is interrupted then the method is required to throw a Additionally the blocking methods on Example from
2 and 3 should already be handled by
Yes, but only if you look exclusively at the function signature. And by that logic it could also throw
Agreed. I try to avoid including any implementation specific behaviour. See above for my understanding of the specification. And feel free to call out anything you see as implementation specific.
That's fine. I just saw a piece of code that looked strange to me and wanted to point it out. Not to demand any changes.
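The reading of the specification argued in this comment (an interrupt during a blocking channel operation closes the channel, sets the interrupt status and raises `ClosedByInterruptException`) can be observed on a plain JDK channel. A minimal sketch, assuming `java.nio.channels.Pipe` as a stand-in for a socket channel so that no network setup is needed; the class and method names are hypothetical:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

final class InterruptedReadSketch {
    static String run() throws IOException, InterruptedException {
        Pipe pipe = Pipe.open();
        AtomicReference<String> outcome = new AtomicReference<>("read returned normally");
        Thread reader = new Thread(() -> {
            try {
                // Blocks, because nothing is ever written to the sink.
                pipe.source().read(ByteBuffer.allocate(1));
            } catch (ClosedByInterruptException e) {
                outcome.set("ClosedByInterruptException, interrupted="
                        + Thread.currentThread().isInterrupted());
            } catch (IOException e) {
                outcome.set(e.getClass().getSimpleName());
            }
        });
        reader.start();
        Thread.sleep(200); // give the reader time to block in read()
        reader.interrupt();
        reader.join();
        String result = outcome.get() + ", open=" + pipe.source().isOpen();
        pipe.sink().close();
        pipe.source().close();
        return result;
    }

    static String demo() {
        try {
            return run();
        } catch (IOException | InterruptedException e) {
            return "demo failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The writing end stays open until after the interrupt, so the reader is interrupted on a live connection rather than racing with a close from the other side.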
Previously, we were throwing SocketClosedException in cases where BrokenPipeSocketException was more appropriate. This enables us to better differentiate between AsynchronousCloseException (broken pipe, etc.) and any other ClosedChannelException. Adjust the corresponding unit tests, which now always either throw AsynchronousCloseException or ClosedByInterruptException. #158
Under some circumstances, we would erroneously throw a ClosedChannelException. Rework the code to not throw an exception unless necessary, and throw the proper exception based on the error code (ClosedChannelExceptions are now handled separately in Java code). #158
Previously, the test may fail sporadically due to a race condition between receiving a "thread interrupted" state on a live connection and one that has been closed already (resulting in an AsynchronousCloseException instead of a ClosedByInterruptException). Make sure that the client connection is closed from the server side only after a delay that is significantly longer than any other delay in this test class. Run that close() in a separate Thread to avoid unnecessarily long delays. Finally, set all expected exceptions back to ClosedByInterruptException, as intended by the original author (cenodis). #158
Refactor the unit tests for issue 158 such that we can compare the behavior of junixsocket AFUNIXSocketChannel etc. and the Java 16+ JEP380 as well as regular Java inet versions. Add both tests to the selftest, but disable the JEP380/Inet versions by default (enable with -Dselftest.enable-module.junixsocket-common.JEP380=true -Dselftest.enable-module.junixsocket-common.JavaInet=true ) Add concise exception logging (enable with -Dselftest.issue.158.debug=true ). Lastly, move some JEP380-specific logic into its own class. #158
... and log the stacktrace for unexpected exceptions #158
OK, so I think I have it now. There are a few things I was able to change, and now we get the expected I had to modify your test case again in one particular regard though: Because of a race condition, the interrupt may have occurred after closing the socket from the server-side (there was no delay). In that case, the latest iteration of my changes correctly returned Moreover, I reworked the tests so we can now verify the exact behavior of JEP380 (and Java Inet) sockets on the same code (see the three subclasses in commit 0bed969) Surprisingly, without a delay of closing the socket (run with On the other end, adding a significant delay will cause the tests to work as expected in all cases. Regarding your concern about specific exceptions, I can only again stress that checking for specific exceptions is not recommended. Using exception handling for flow control is extremely brittle (I've run into it myself while fixing this bug; see commit 66fb640). By running the test code against the Java SDK JEP380 code, I was able to occasionally trigger a case where
So it's fair to say that your assumptions around certain exception types don't hold true even for the Java SDK JEP380 implementation. This actually goes against their own javadocs, see
I will file a bug to Oracle shortly afterwards. Notably, I agree that in this case it should have been an
Please verify with code from the latest commit 9f5377f. You can also run the latest selftest jar as follows:
Try inserting
after the first line to disable the delay, which should then occasionally trigger the bug in JEP380 code (verified on a MacBook Pro M1 Max). It will also show that junixsocket will throw an
Java bug reported to Oracle, JDK-8335600
Tested on Thank you for putting up with this nitpicky issue and updating everything to be in line with the spec. Even the optional parts in
That does seem very strange. I could imagine a world where the OS reads the byte into cache even if the consumer application is not ready yet. But that doesn't fit with the even greater delay resulting in a "not-send" state. Can't say I have an idea on what could cause this.
I see where I made the mistake now. Sorry about that. I reviewed it twice and have no idea how that slipped past me.
I actually agree with this sentiment. On the other hand sometimes exception types are the only way to meaningfully distinguish between different kinds of errors.
Agreed. This does seem like a bug in the SDK.
@cenodis I'm glad we got this resolved. Investigating this bug was quite fruitful, as we found a couple more issues in junixsocket, plus a JDK bug :)
junixsocket 2.10.0 has been released. Please verify and re-open if necessary. Thanks again for reporting, @cenodis!
Split off from #157
The current 2.10.0 snapshot has added support for virtual thread I/O. As part of this change the Sockets and SocketChannels have gained the ability to be interrupted to support task cancellation.
While junixsocket does properly respond to an interrupt by throwing an exception, the type of the exception thrown as well as the state of the socket and thread after an interruption do not match the default implementation in the JDK.
Short overview
This table shows the state of the program immediately after the virtual I/O thread has received an interrupt and returned to the calling code:
SocketException
ClosedByInterruptException
InterruptedIOException
InterruptedIOException
Details
Socket
While the current Socket implementation is valid according to the specification, I think there is value in matching the behaviour of the default implementation as closely as possible to minimize potential incompatibilities. This is especially relevant when using junixsocket to "fake" a normal IP socket and passing it into a library that normally does not support domain sockets.
SocketChannel
java.nio.channels.SocketChannel, on the other hand, does document that it must throw a ClosedByInterruptException, interrupt the current thread and close the channel if it is interrupted during an I/O operation. So the current implementation is not strictly compliant with the specification.
Open Socket and InterruptedIOException
The exception in combination with the still open socket may also present a rather unique issue.
The problem with leaving the socket open after an interrupt is that it's not obvious if the I/O operation has completed fully, failed or only partially completed. This leaves the socket in an inconsistent state in which the caller does not necessarily know which bytes should be processed next to ensure a well formed message is created.
If it is a read operation then certain bytes may also have been read from the underlying socket but not arrived at the caller due to the exception, requiring the socket to perform otherwise unnecessary buffering to ensure no information is lost.
It is theoretically possible to pass the necessary information to the caller via InterruptedIOException and its bytesTransferred field which according to its documentation:
I would like to advocate against using this exception however.
Currently junixsocket does not set this field at all, meaning it is always 0. I am not sure this is correct as to my understanding it's possible for the async runtime to submit the operation to the OS before the interrupt exception is thrown.
Throwing InterruptedIOException along with leaving the socket open could result in code attempting to handle this by rewinding its buffer and playing the "missing" bytes back. But if the number of missing bytes is incorrect this will instead corrupt the socket stream.
This entire "recovery" process feels very brittle and easy to misuse even if the byte count is correct. It may be even further complicated if there are multiple layers of buffering which would each need to rewind. Closing the socket implicitly on an interrupt avoids this scenario altogether since no replay can be attempted.
Considering how easy it is to implement custom interrupt behaviour with virtual threads and the fact that the JDK has also chosen this approach, I do not see much benefit in supporting this kind of half-write.
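To illustrate the replay concern described above, here is a minimal, self-contained sketch. It does not use junixsocket: `interruptedWrite` is a hypothetical transport that accepts a few bytes and then throws `InterruptedIOException`, either with or without filling in the public `bytesTransferred` field, and the "wire" is just an in-memory buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.InterruptedIOException;
import java.nio.charset.StandardCharsets;

final class ReplaySketch {
    /** Hypothetical transport: accepts `accepted` bytes, then reports an interrupt. */
    static void interruptedWrite(ByteArrayOutputStream wire, byte[] msg, int accepted,
            boolean reportCount) throws InterruptedIOException {
        wire.write(msg, 0, accepted);
        InterruptedIOException e = new InterruptedIOException("interrupted mid-write");
        if (reportCount) {
            e.bytesTransferred = accepted; // otherwise the field keeps its default of 0
        }
        throw e;
    }

    /** Sends `text`, "recovering" from the interrupt by resuming at the reported offset. */
    static String sendWithReplay(String text, int accepted, boolean reportCount) {
        byte[] msg = text.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try {
            interruptedWrite(wire, msg, accepted, reportCount);
        } catch (InterruptedIOException e) {
            wire.write(msg, e.bytesTransferred, msg.length - e.bytesTransferred);
        }
        return new String(wire.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(sendWithReplay("HELLO", 2, true));  // prints HELLO
        System.out.println(sendWithReplay("HELLO", 2, false)); // prints HEHELLO
    }
}
```

With an accurate count the replay reconstructs the message; with the field left at 0, the first two bytes are sent twice and the stream is corrupted. Closing the socket on interrupt removes the temptation to attempt this kind of recovery at all.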