JVM Crash with SIGSEGV

JVM Crash with SIGSEGV

huy1912
While testing the ShutdownHook for SIGTERM, I got the JVM crash below:

#0  0xb7fe87a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0xb7eb2825 in raise () from /lib/tls/libc.so.6
#2  0xb7eb4289 in abort () from /lib/tls/libc.so.6
#3  0xb78421ff in os::abort () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#4  0xb798aa87 in VMError::report_and_die () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#5  0xb798b651 in crash_handler () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#6  <signal handler called>
#7  0xb75e46f9 in GenCollectedHeap::is_in () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#8  0xb783f401 in os::print_location () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#9  0xb784a078 in os::print_register_info () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#10 0xb798a08b in VMError::report () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#11 0xb798a990 in VMError::report_and_die () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#12 0xb784951c in JVM_handle_linux_signal () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#13 0xb78451e4 in signalHandler () from /usr/java/jdk1.6.0_43/jre/lib/i386/server/libjvm.so
#14 <signal handler called>
#15 0x535ad1f9 in glXChooseVisual () from /usr/lib/libGL.so.1
#16 0x0ce3b3c8 in ?? ()
#17 0x00000000 in ?? ()

Note: I was unable to reproduce the JVM crash systematically, but I saw the core dump once during testing.

- System Overview
The system has multiple dialogs that display drawing data on a GLCanvas using JOGL 2.1.3. Each GLCanvas uses its own animation thread to draw its own data. While drawing, each animation thread creates a deep clone of the data provided by the data thread in order to avoid concurrency issues.
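
For readers unfamiliar with the deep-clone approach described above, here is a minimal sketch of the idea. The real data model is not shown in this thread, so DrawPoint and DrawingData are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical drawing record; the actual data model is not part of the post.
final class DrawPoint {
    double x, y;
    DrawPoint(double x, double y) { this.x = x; this.y = y; }
    DrawPoint copy() { return new DrawPoint(x, y); }
}

// Hypothetical holder shared between the data thread and the animation threads.
final class DrawingData {
    private final List<DrawPoint> points = new ArrayList<>();

    // Called by the data thread. A copy is stored so the caller cannot
    // mutate the shared state afterwards.
    synchronized void add(DrawPoint p) {
        points.add(p.copy());
    }

    // Called by an animation thread. Returns an independent deep copy, so
    // later updates by the data thread do not affect the frame being drawn.
    synchronized List<DrawPoint> snapshot() {
        List<DrawPoint> copy = new ArrayList<>(points.size());
        for (DrawPoint p : points) {
            copy.add(p.copy());
        }
        return copy;
    }
}
```

Each animation thread would call snapshot() once per frame and draw only from the returned list.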

The VM arguments for the Java process: -Djogl.1thread=false -Djogamp.common.utils.locks.Lock.timeout=10000

- Shutdown requirement
When the system receives the SIGTERM signal, the ShutdownHook is used to clean up the services related to data and terminate the running JVM (System.exit(0)). This is to ensure that there is no resource leak on the server side.
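
As a rough illustration of this kind of hook, the sketch below runs cleanup steps in a fixed order. ShutdownSequence and its step names are hypothetical; the actual hook from the project is not shown in this thread. The only JDK call involved is Runtime.addShutdownHook.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs cleanup steps in registration order.
final class ShutdownSequence implements Runnable {
    private final List<Runnable> steps = new ArrayList<>();

    ShutdownSequence then(Runnable step) {
        steps.add(step);
        return this;
    }

    @Override
    public void run() {
        for (Runnable step : steps) {
            try {
                step.run();
            } catch (RuntimeException e) {
                // Keep going: later steps (for example closing the server
                // connection) should still run if an earlier one fails.
                System.err.println("Shutdown step failed: " + e);
            }
        }
        // No System.exit() here: the JVM is already shutting down when hooks
        // run, and Runtime.exit is documented to block indefinitely when it
        // is called while shutdown hooks are running.
    }

    // Registers the sequence as a JVM shutdown hook and returns the hook thread.
    Thread install() {
        Thread hook = new Thread(this, "shutdown-sequence");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A possible use, with both steps being placeholders: new ShutdownSequence().then(stopRendering).then(closeServerConnection).install(). Putting the rendering shutdown first is an assumption about what order is safest, not something confirmed for this crash.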

- Issue
JVM crash occurred after invoking the above JVM termination (System.exit(0)).

- Analysis
It's likely that some pending animation threads were still running even after System.exit(0) was called.
I couldn't figure out why glXChooseVisual gets invoked.
I inspected the JOGL implementation and found that glXChooseVisual is invoked via X11GLXGraphicsConfigurationFactory.chooseGraphicsConfigurationXVisual, which in turn is invoked by GLCanvas.chooseGraphicsConfiguration when the GLCanvas is added to a container.
Below is the sequence diagram

Sequence Diagram

However, this assumption is not correct, as no GLCanvas was added to the Container.

I couldn't find the native code to determine why glXChooseVisual gets invoked right before the <signal handler called> frame. I did check NativeWindowFactory, which listens for the ShutdownHook event, but it doesn't end up calling glXChooseVisual.

I would deeply appreciate it if someone could shed some light on this, as I need to provide an analysis of the crash.

Re: JVM Crash with SIGSEGV

gouessej
Administrator
Hi

Please provide an SSCCE. It's difficult to answer your question without knowing exactly what is executed. Maybe isGLXVersionGreaterEqualOneThree() calls glXChooseVisual, but that seems far-fetched.
Julien Gouesse | Personal blog | Website

Re: JVM Crash with SIGSEGV

huy1912
Thanks for your response. I have updated the question with the scenario and my analysis. I hope you can help with this.
Thanks.

Re: JVM Crash with SIGSEGV

gouessej
Administrator
Please use JOGL 2.3.2. We don't maintain obsolete versions and we won't backport any fixes.

Maybe your animation threads go on running. It's difficult to know the root cause with no access to your source code. You should try to stop those threads or to do something so that they don't affect the canvases. You can dispose these canvases too. Do you use an animator?
Julien Gouesse | Personal blog | Website

Re: JVM Crash with SIGSEGV

huy1912
1. I can't update to the latest JOGL version due to project constraints.

2. You're right that some animation threads are still running before they are stopped.
I know that it's very hard to troubleshoot the root cause without looking at the source code. Unfortunately, the source cannot be shared because it is proprietary.

I had thought of disposing the canvas (GLCanvas.dispose()), which would pause the animation thread and make the GLDrawable invalid.

But the root cause is not related to the display() and reshape() methods, which need to validate the GLDrawable.
Actually, I couldn't find out why glXChooseVisual gets invoked right before the <signal handler called> frame in native code. Unfortunately, I can't find the native source code to analyse the root cause.

3. What do you mean by "animator" in your question "Do you use an animator"?
In my project, an FPSAnimator is used to redraw the canvas whenever data is updated.
The drawing is handled by each canvas's own FPSAnimator thread.

Re: JVM Crash with SIGSEGV

gouessej
Administrator
huy1912 wrote
1. I can't update to the latest JOGL version due to project constraints.
We only maintain the very latest version. If you're affected by a bug, we'll be totally unable to help you. You use a noticeably old version, and numerous changes have occurred in the meantime.

huy1912 wrote
2. You're right that some animation threads are still running before they are stopped.
I know that it's very hard to troubleshoot the root cause without looking at the source code. Unfortunately, the source cannot be shared because it is proprietary.

I had thought of disposing the canvas (GLCanvas.dispose()), which would pause the animation thread and make the GLDrawable invalid.

But the root cause is not related to the display() and reshape() methods, which need to validate the GLDrawable.
Actually, I couldn't find out why glXChooseVisual gets invoked right before the <signal handler called> frame in native code. Unfortunately, I can't find the native source code to analyse the root cause.
If you want to get some help, provide at least an SSCCE. If the root cause of your problem is outside of the snippet you gave, we will never find the solution, and in any case you need to isolate exactly what causes your problem.

huy1912 wrote
3. What do you mean by "animator" in your question "Do you use an animator"?
In my project, an FPSAnimator is used to redraw the canvas whenever data is updated.
The drawing is handled by each canvas's own FPSAnimator thread.
Then you can stop the animator and remove the canvas(es) from it.
Julien Gouesse | Personal blog | Website

Re: JVM Crash with SIGSEGV

huy1912
I couldn't reproduce the JVM crash, so I couldn't provide the SSCCE as suggested. I know it's very hard to troubleshoot without concrete steps or a scenario.

What I need to do is keep trying to find systematic steps to reproduce the crash. If I have no luck, I will manually stop the animator and remove the canvas as suggested.
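
For reference, the stop-then-remove order discussed above could be sketched as follows. CanvasAnimator is a hypothetical stand-in interface so that the example compiles without JOGL; it is not the JOGL API, and the calls would need to be mapped onto FPSAnimator and GLCanvas in a real project.

```java
import java.util.List;

// Hypothetical stand-in for the animator operations mentioned in this thread
// (stop the animator, remove a canvas from it). NOT the JOGL API.
interface CanvasAnimator<C> {
    boolean isAnimating();
    void stop();
    void remove(C canvas);
}

final class RenderShutdown {
    // Stops the animator first so that no further frames are scheduled,
    // then detaches every canvas. Returns the number of canvases removed.
    static <C> int stopAndDetach(CanvasAnimator<C> animator, List<C> canvases) {
        if (animator.isAnimating()) {
            animator.stop();
        }
        int removed = 0;
        for (C canvas : canvases) {
            animator.remove(canvas);
            removed++;
        }
        return removed;
    }
}
```

Stopping before removing is an assumption about the safest order; whether it prevents this particular crash has not been verified.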

Thanks a lot for your time and effort in helping troubleshoot the issue.

Re: JVM Crash with SIGSEGV

gouessej
Administrator
huy1912 wrote
What I need to do is keep trying to find systematic steps to reproduce the crash. If I have no luck, I will manually stop the animator and remove the canvas as suggested.
Stopping the animator and removing the canvas from it is a smart solution. What's wrong with that?
Julien Gouesse | Personal blog | Website

Re: JVM Crash with SIGSEGV

huy1912
Nothing wrong with your suggested solution.
The thing is that I need to provide a root cause analysis as well as the solution as part of my project process. Sometimes the customer may reject a fix with an improper or unclear root cause analysis.

Thanks for your time and effort.

Re: JVM Crash with SIGSEGV

gouessej
Administrator
I understand your position; it's more professional. I'll let you know if I can clarify something.

Edit: Calling OpenGL or X11 methods from the wrong thread or process is usually a bad idea, especially when the native memory of some structures has already been released.
Julien Gouesse | Personal blog | Website

Re: JVM Crash with SIGSEGV

Xerxes Rånby
gouessej wrote
I understand your position; it's more professional. I'll let you know if I can clarify something.

Edit: Calling OpenGL or X11 methods from the wrong thread or process is usually a bad idea, especially when the native memory of some structures has already been released.
It is highly likely that the crash huy1912 experienced is caused by not calling AWT + Swing methods from the event dispatch thread.

When we tested the 2.1.3 release, we ran into a similar-looking crash in glXChooseVisual, or alternatively in glXCreateNewContext, when the AWT removeNotify was called from a non-AWT-EDT thread by the IcedTea-Web browser plugin.
https://jogamp.org/bugzilla/show_bug.cgi?id=910
https://gist.github.com/xranby/3e3b4ebd5b1fd67cef13 - sigsegv [libGL.so.1+0x6bf82] glXChooseVisual+0x6472 during GraphTextDemo applet window close on 32bit ubuntu 12.04 + icedtea-web using opengl 4.3 core profile 4.3.0 NVIDIA 319.32 drivers

To be sure that huy1912 is experiencing this issue, we need to see the full HotSpot error log.
Please provide all information stated in: https://jogamp.org/wiki/index.php/Jogl_FAQ#Bugreports_.26_Testing

If we have a test case and a full bug report, we can investigate whether a workaround similar to the one we added to NewtCanvasAWT can be added to GLCanvas to prevent the crash even if removeNotify is called from the wrong thread.

Re: JVM Crash with SIGSEGV

jmaasing
Xerxes Rånby wrote
It is highly likely that the crash huy1912 experienced is caused by not calling AWT + Swing methods from the event dispatch thread.
Very good guess. I think you are not even allowed to close windows from threads other than the Swing EDT. That means if you have a shutdown hook that calls setVisible(false) or similar, you are breaking the threading contract of AWT/Swing, and 'unexpected' behaviour is actually to be expected.

Re: JVM Crash with SIGSEGV

huy1912
In reply to this post by gouessej
gouessej wrote
Edit: Calling OpenGL or X11 methods from the wrong thread or process is usually a bad idea, especially when the native memory of some structures has already been released.
There were no calls to OpenGL or X11 methods from the ShutdownHook thread when the SIGTERM was triggered.
And I am very sure that no GLCanvas was added to or removed from the Container during this time, and no window was closed either.
So I suspect that some Animator thread(s) were trying to access a native method, which resulted in the JVM crash.

Re: JVM Crash with SIGSEGV

huy1912
In reply to this post by Xerxes Rånby
Xerxes Rånby wrote
It is highly likely that the crash huy1912 experienced is caused by not calling AWT + Swing methods from the event dispatch thread.
As mentioned in the original post, the system uses its own animator thread to draw on each GLCanvas (-Djogl.1thread=false) instead of relying heavily on the EDT for drawing. The EDT is used to manage UI events and to provide the drawing data to each GLCanvas when a UI update needs to be redrawn.

Xerxes Rånby wrote
To be sure that huy1912 is experiencing this issue, we need to see the full HotSpot error log.
Please provide all information stated in: https://jogamp.org/wiki/index.php/Jogl_FAQ#Bugreports_.26_Testing

JOGL, Platform and OpenGL Version: see test.log
Hotspot error log: see hs_err_pid3137.zip.

Xerxes Rånby wrote
If we have a test case and a full bug report, we can investigate whether a workaround similar to the one we added to NewtCanvasAWT can be added to GLCanvas to prevent the crash even if removeNotify is called from the wrong thread.
The documentation of GLCanvas.removeNotify explicitly states: "User shall not call this method outside of EDT, read the AWT/Swing specs about this." So this method is not invoked outside of the EDT in my system.

Re: JVM Crash with SIGSEGV

huy1912
In reply to this post by jmaasing
jmaasing wrote
That means if you have a shutdown hook that calls setVisible(false) or similar, you are breaking the threading contract of AWT/Swing, and 'unexpected' behaviour is actually to be expected.
The shutdown hook in my system is only used to close the connection to the server as well as clean up resources. It doesn't invoke any methods related to the AWT/Swing.

What I plan to do to avoid the crash is to invoke GLCanvas.destroy() in the shutdown hook thread. Having checked the GLCanvas.destroy implementation, I think it's safe to invoke this method from a non-EDT thread; it disposes the GL and makes the GLDrawable invalid by setting GLCanvas.drawable to null.

In your opinion, is there any issue with invoking GLCanvas.destroy() that may result in a JVM crash?

Re: JVM Crash with SIGSEGV

jmaasing
huy1912 wrote
In your opinion, is there any issue with invoking GLCanvas.destroy() that may result in a JVM crash?
I don't know, but to be safe you had better look at the source. In general, OpenGL is not thread-safe either, so if that method makes calls to OpenGL, it can result in native crashes. Maybe you can schedule the destroy operation on the EDT, and then you should (hopefully) be OK.
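
Scheduling work on the EDT could look roughly like the sketch below. EdtTasks and runOnEdtAndWait are hypothetical names; EventQueue.isDispatchThread and EventQueue.invokeAndWait are standard AWT calls. Whether the EDT is still processing events while a shutdown hook runs is an assumption that would need to be checked in the real application.

```java
import java.awt.EventQueue;
import java.lang.reflect.InvocationTargetException;

// Hypothetical helper for running a task on the AWT event dispatch thread.
final class EdtTasks {
    // Runs the task on the EDT and waits for it to finish. If the caller is
    // already on the EDT, the task runs inline, because
    // EventQueue.invokeAndWait throws an Error when called from the EDT.
    static void runOnEdtAndWait(Runnable task)
            throws InterruptedException, InvocationTargetException {
        if (EventQueue.isDispatchThread()) {
            task.run();
        } else {
            EventQueue.invokeAndWait(task);
        }
    }
}
```

With a GLCanvas named canvas (a placeholder), the destroy call would then become EdtTasks.runOnEdtAndWait(() -> canvas.destroy()). Note that this blocks the calling thread until the EDT has run the task.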