SIGSEGV during SSL.freeSSL on netty-tcnative-boringssl-static v2.0.61.Final #842
Comments
We're also seeing this crash. It seems to occur more frequently on services that handle more SSL sessions per unit of time.
Seems to be the same report as issue #833.
Here's our hotspot error file. It happens with both Java 11 and Java 21.
I would like to add that we're using Netty via Vert.x, and the crash only appears when multiple server instances are deployed; when only one instance is deployed, the crash does not happen.
Sorry, I spoke too soon: it also happens when only one verticle is deployed.
Just to give a recap of what happened yesterday: a few of our instances were overloaded (possibly because of a DDoS attack), and under that load the process with OpenSSL enabled would crash within a few seconds of starting. When we switched the same process to the JDK SSL implementation, it did not crash. I'm not sure how helpful this is, but it's a pointer in the right direction: there is a bug in the OpenSSL implementation that appears with a frequency proportional to the usage of SSL code. This also happens in:
Let me have a look... Never saw this in prod here, though.
@conet @gavinbunney would it be possible to run with:
I will try to create a reproducer; if the number of SSL sessions is high enough, it should work.
@conet thanks a lot
I wonder if it might be caused by #850
@conet @gavinbunney please check if this still happens with 2.0.63.Final
Unfortunately it is still happening. I will try to create the reproducer.
@conet ok... waiting for the reproducer then, as I can't reproduce it.
To create the reproducer, I tried to write a client that would crash the version that used 2.0.62.Final. I failed to do that no matter how I tried to overload the server, which means the public traffic that was causing this contains something that makes the crash more likely, and I failed to simulate it (we only saw the crash on traffic open to the internet). It's hard to find a tool out there that simulates connection open/close; I tried
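The connection open/close load described in this comment could be approximated with a small JDK-only client. The following is a minimal sketch, not the tool used in this thread and not a confirmed reproducer: the class name `TlsChurnClient`, its methods, and all parameters are hypothetical. It completes a TLS handshake and then resets the TCP connection without sending `close_notify`, which is one guess at what abrupt internet traffic looks like to the server. It assumes the server certificate is trusted by the JVM's default trust store; otherwise the handshakes are counted as failures.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

/** Hypothetical load client: opens many short-lived TLS connections and drops them abruptly. */
public class TlsChurnClient {
    final AtomicInteger handshakes = new AtomicInteger(); // completed TLS handshakes
    final AtomicInteger failures = new AtomicInteger();   // connect or handshake errors

    /** One attempt: TCP connect, TLS handshake, then reset the connection without close_notify. */
    void attempt(String host, int port, int timeoutMs) {
        try (Socket raw = new Socket()) {
            raw.connect(new InetSocketAddress(host, port), timeoutMs);
            raw.setSoTimeout(timeoutMs);
            raw.setSoLinger(true, 0); // closing the raw socket sends RST instead of a clean shutdown
            SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
            // autoClose=false: the TLS layer does not own the raw socket and is never closed itself.
            SSLSocket tls = (SSLSocket) factory.createSocket(raw, host, port, false);
            tls.startHandshake();
            handshakes.incrementAndGet();
            // Leaving the try block closes only the raw socket, so the server sees an abrupt reset.
        } catch (IOException e) {
            failures.incrementAndGet();
        }
    }

    /** Runs the given number of attempts on a fixed-size thread pool and waits for them to finish. */
    void run(String host, int port, int connections, int threads, int timeoutMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < connections; i++) {
            pool.execute(() -> attempt(host, port, timeoutMs));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: TlsChurnClient <host> <port> [connections] [threads]");
            return;
        }
        int connections = args.length > 2 ? Integer.parseInt(args[2]) : 10_000;
        int threads = args.length > 3 ? Integer.parseInt(args[3]) : 64;
        TlsChurnClient client = new TlsChurnClient();
        client.run(args[0], Integer.parseInt(args[1]), connections, threads, 2_000);
        System.out.println("handshakes=" + client.handshakes.get()
                + " failures=" + client.failures.get());
    }
}
```

Whether this traffic pattern triggers the crash is unknown; it only covers the "handshake, then reset" case, and real internet traffic may also include half-finished handshakes or malformed records.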
@conet without a reproducer it is almost impossible for me to find the root cause... I inspected the code but can't see anything wrong atm :/
Thanks @normanmaurer for the changes. We are still seeing the crashes as well, with the same hotspot error running
Do you have a reproducer?
Not yet :( We do see some logs with (about 70s before the crash):
bummer... keep me posted. I tried everything to reproduce but no luck :/
Does this still happen with 2.0.66?
We periodically see crashes in io.netty.internal.tcnative.SSL.freeSSL running netty-tcnative-boringssl-static v2.0.61.Final (with netty 4.1.106.Final). This appears to happen on around 8 instances in our fleet each day, without any particular noticeable repro pattern.
The invalid memory reference happens during the ssl engine shutdown, sslReadErrorResult, when freeing the ssl engine refs.
full hotspot error - hs_err_pid4344.log
I have a few other hotspot error logs as well, but their thread stacks show the same information: