JOGL/JDK7 RC10+ Display Issue, With Test Case

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

JOGL/JDK7 RC10+ Display Issue, With Test Case

rhatcher
This problem is related to AWT/JOGL Deadlock, Revisited.  I've started another topic because although the problem presents itself similar to the noted post, the jstack results are slightly different and I thought it best to not mix up the associated files.

Relevant versions:

- Fedora 12
- Sun/Oracle JDK 7u6.
- JOGL/Gluegen I built using JDK 6u33 from a cloned repo as of Friday 8/10/2012 Note that I tried building 08/16 repos but got a failure in jogl building c.build.newt.windowlib.

I've uploaded files JoglHangTest1.java and BigRedXCanvas.java which can be used to reproduce the problem.

In my testing so far this problem appears specific to JDK7, and happens on both 7u5 and 7u6.

The test app will initially create a small window with a single button labeled "Make Frame".  When I press this button on my system when running with JDK7 I get one of three results:

1) Everything will work, and I get the expected JOGL window with a big red X on a black background.

2) The JOGL window will come up but be blank/gray until I wiggle the mouse, at which point it will draw.

3) The JOGL window will come up, be blank/gray, and never draw.  Additionally, the "Make Frame" button will remain depressed and the UI will become unresponsive as shown in this screen snippet.

When I run under JDK 6u33 I so far have always gotten result 1).

However, I ran 72 times (there is nothing special about 72; that's just where I happened to get sick of running the thing) under JDK 7u6 and got result 1) 5 times, result 2) 64 times, and result 3) 3 times.

When result 3 happens I get jstack output like this.

With this particular test app result 3 is almost always a deadlock, though one time I saw it present with the Animator thread stack trace similar to the post I noted at the front of this message.

Can anybody reproduce similar results?


In case it's relevant... the mechanics of the failure noted in AWT/JOGL Deadlock, Revisited are indirectly related to JOGL interrupting the AWT EventDispatchThread as in this stack trace:

    at java.lang.Thread.interrupt(Thread.java:971)
    at java.awt.EventDispatchThread.interrupt(EventDispatchThread.java:97)
    at jogamp.common.util.locks.RecursiveThreadGroupLockImpl01Unfairish.unlock(RecursiveThreadGroupLockImpl01Unfairish.java:195)
    at jogamp.common.util.locks.RecursiveLockImpl01Unfairish.unlock(RecursiveLockImpl01Unfairish.java:264)
    at javax.media.opengl.GLProfile.initSingleton(GLProfile.java:182)
    at javax.media.opengl.GLProfile.initSingleton(GLProfile.java:199)
    at javax.media.opengl.GLProfile.getProfileMap(GLProfile.java:1914)
    at javax.media.opengl.GLProfile.get(GLProfile.java:848)
    at jogamp.opengl.x11.glx.X11GLXDrawableFactory$SharedResourceImplementation.createSharedResource(X11GLXDrawableFactory.java:254)
    at jogamp.opengl.SharedResourceRunner.run(SharedResourceRunner.java:261)
    at java.lang.Thread.run(Thread.java:741)

Maybe this is normal/expected (?).  But... under JDK7 this - in conjunction with some other AWT stuff that has changed a lot since JDK6 - ultimately renders the AWT event dispatch thread unwilling to process events.  The InvocationEvents for drawing end up getting backed up in a queue that's no longer being serviced, and the event dispatch thread wants to exit but can't meet the required exit conditions.  Because the event dispatch thread can't exit - but also has been interrupted and so will no longer pump events - everything freezes up in the UI.

I could get into the ugly details of it and will do so if you think it will be helpful.
Reply | Threaded
Open this post in threaded view
|

Re: JOGL/JDK7 RC10+ Display Issue, With Test Case

gouessej
Administrator
I didn't know that AWT and Swing have so much changed in Java 1.7. I think that you have enough information to write a proper bug report. Are we really concerned by these invocation events? I thought they were used only by JavaFX.
Julien Gouesse | Personal blog | Website
Reply | Threaded
Open this post in threaded view
|

Re: JOGL/JDK7 RC10+ Display Issue, With Test Case

rhatcher
Hi Julien,

> I didn't know that AWT and Swing have so much changed in Java 1.7. I think that
> you have enough information to write a proper bug report.  Are we really concerned
> by these invocation events? I thought they were used only by JavaFX.

At least since 1.5 - and possibly earlier - InvocationEvents are the way anything executed by either EventQueue.invokeLater and EventQueue.invokeAndWait get put on the queue for processing.  This ultimately includes Animator's display events.

Consider this stack trace, similar to the one from my other post related to deadlocks.  Note that the line numbers on this stack trace may not agree with the stock src.zip because I had to use a custom AWT build to figure out what was going on:

"Timer-6" prio=10 tid=0xa58ed000 nid=0x4c3c in Object.wait() [0xa5cc7000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xad0184a8> (a java.awt.EventQueue$1AWTInvocationLock)
    at java.lang.Object.wait(Object.java:503)
    at java.awt.EventQueue.invokeAndWait(EventQueue.java:1232)
    - locked <0xad0184a8> (a java.awt.EventQueue$1AWTInvocationLock)
    at jogamp.common.awt.AWTEDTExecutor.invoke(AWTEDTExecutor.java:53)
    at jogamp.opengl.awt.AWTThreadingPlugin.invokeOnOpenGLThread(AWTThreadingPlugin.java:103)
    at jogamp.opengl.ThreadingImpl.invokeOnOpenGLThread(ThreadingImpl.java:206)
    at javax.media.opengl.Threading.invokeOnOpenGLThread(Threading.java:172)
    at javax.media.opengl.Threading.invoke(Threading.java:191)
    at javax.media.opengl.awt.GLCanvas.display(GLCanvas.java:428)
    at com.jogamp.opengl.util.AWTAnimatorImpl.display(AWTAnimatorImpl.java:74)
    at com.jogamp.opengl.util.AnimatorBase.display(AnimatorBase.java:142)
    at com.zedasoft.cba.view.DisplaySetAnimator.display(DisplaySetAnimator.java:105)
    at com.jogamp.opengl.util.FPSAnimator$1.run(FPSAnimator.java:128)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)

GLCanvas wants to redisplay, and ultimately this leads to EventQueue's invokeAndWait method, which does this in at least 1.5-22, 6u33, and 7u6:

        class AWTInvocationLock {}
        Object lock = new AWTInvocationLock();

        InvocationEvent event =
            new InvocationEvent(Toolkit.getDefaultToolkit(), runnable, lock,
                                true);

        synchronized (lock) {
            Toolkit.getEventQueue().postEvent(event);
            while (!event.isDispatched()) {
                lock.wait();
            }
        }

The runnable that will do the redraw gets wrapped in an InvocationEvent and posted to the queue, then we wait for the InvocationEvent's dispatch() method to get called.

In my other post about deadlocks the "while" loop in the snippet above is where it gets hung.  The presentation in that post was slightly different than the hangs I observed in the test case included with this post.  I'm not sure the two are related, but my guess is that they are.

As mentioned previously, the InvocationEvent ends up never getting dispatched because it's backed up in a queue that's no longer being serviced.  Event servicing stops because the event dispatch thread has been interrupted indirectly by earlier GLProfile singleton initialization.

It may be this way in 1.6 and earlier too, but in 1.7 I observed that an app that works will start with one event dispatch thread that will run for a while and then exit.  Later a "wake up" happens due to other events in the system, and a second event dispatch thread is created and picks up where the first thread left off.  From this point on everything works.  I'm not sure what these other events are, but window stuff seems to be part of it.

In an app that hangs the first event dispatch thread is prevented from exiting due to what appears to be a race.  The run method in EventDispatchThread looks like this in 1.7 (this is different than in 1.6):

    public void run() {
        while (true) {
            try {
                pumpEvents(new Conditional() {
                    public boolean evaluate() {
                        return true;
                    }
                });
            }
            finally {
                if(getEventQueue().detachDispatchThread(this, shutdown)) {
                    break;
                }
            }
        }
    }

It turns out that sometimes the system will get into a state where the detatchDispatchThread() call can't succeed even if the "shutdown" field is true.  Note that "shutdown" gets set to true as soon as the event dispatch thread is interrupted indirectly via the earlier GLProfile singleton initialization, which also is what makes the thread stop servicing events.

The detachDispatchThread method is prevented from succeeding because this method:

SunToolkit.isPostEventQueueEmpty()

...returns false.  So, I guess the race is that sometimes by the time the event dispatch thread wants to exit there is nothing on the PostEventQueue, and other times something has been put there at just the wrong time.

The end result is that everything stops working, apparently for the life of the app.


As far as AWT changing a lot in 1.7, consider this method from EventQueue.java as it existed in 1.5 and 1.6u33:

    final void detachDispatchThread() {
        dispatchThread = null;
    }

In 1.7u6 it looks like this:

    final boolean detachDispatchThread(EventDispatchThread edt, boolean forceDetach) {
        /*
         * This synchronized block is to secure that the event dispatch
         * thread won't die in the middle of posting a new event to the
         * associated event queue. It is important because we notify
         * that the event dispatch thread is busy after posting a new event
         * to its queue, so the EventQueue.dispatchThread reference must
         * be valid at that point.
         */
        pushPopLock.lock();
        try {
            if (edt == dispatchThread) {
                /*
                 * Don't detach the thread if any events are pending. Not
                 * sure if it's a possible scenario, though.
                 *
                 * Fix for 4648733. Check both the associated java event
                 * queue and the PostEventQueue.
                 */
                if (!forceDetach && (peekEvent() != null) || !SunToolkit.isPostEventQueueEmpty()) {
                    return false;
                }
                dispatchThread = null;
            }
            AWTAutoShutdown.getInstance().notifyThreadFree(edt);
            return true;
        } finally {
            pushPopLock.unlock();
        }
    }

Yuck.  I haven't diff'd the AWT code but I'd guess there are plenty of other equally invasive changes in JDK7.  One look and it's easy to believe there are going to be problems somewhere.


This problem and the one from my other post regarding deadlocks appear to require getting EventQueue.invokeLater and/or EventQueue.invokeAndWait in the loop.  In the test app I included obviously we don't need to call invokeLater since the actionPerformed call is already in the event dispatch thread.  However... I included the call to invokeLater because this is closer to what our full-up system does.  We end up having to do this because we don't know what kind of front-end graphics we're going to need until some notification arrives over RMI from other apps.  This of course does not happen in AWT's event dispatch thread, so we end up having to create the UI elements through invokeLater.


There are a few specific things I wanted to get out of this post if possible:

- Do others also observe similar misbehavior from the test app when running under 7u5 or 7u6 (including more deadlocks)?

- I think this is not a JOGL problem, but rather a race in the JDK7 AWT.  Do you concur?

- Do you consider the interruption of the event dispatch thread by the GLProfile singleton initialization to be normal/expected?  I suspect the answer is yes.  Even if the GLProfile init didn't interrupt the event dispatch thread then something else might, so the fact that this has happened shouldn't be considered a problem.

- Can anybody suggest a workaround that will work reliably with JDK7 and SwingUtilities.invokeLater?
Reply | Threaded
Open this post in threaded view
|

Re: JOGL/JDK7 RC10+ Display Issue, With Test Case

gouessej
Administrator
Why not just pushing another EventQueue without this fix when we detect that we are using Java 1.7? We'll have to provide a better fix if we support JavaFX one day.
Julien Gouesse | Personal blog | Website
Reply | Threaded
Open this post in threaded view
|

Re: JOGL/JDK7 RC10+ Display Issue, With Test Case

rhatcher
In reply to this post by gouessej
Possible workaround (?):

I attempted to reproduce this problem with Java2D without JOGL in the loop but couldn't make it fail.

This pointed another finger at the GLProfile initSingleton method's indirect interruption of the event dispatch thread as the triggering event that eventually leads to the hangs, so I took a closer look at this method and noticed that its javadoc said:

"Applications using this method may place it's (sic) call before any other UI invocation in the main class's static block or within the main function.  In such case, applications may pass firstUIActionOnProcess=true to use native toolkit locking."

Based on this suggestion I added a call to GLProfile.initSingleton() to the test app as one of the earliest things it does and this hang problem seems to have disappeared in JDK7.  GLProfile.initSingleton(true) also worked, though it is deprecated.  This also seems to fix the similar issue I was having in our full-up software set.

I'm not sure what this says about how GLProfile.initSingleton() works.  I'm still inclined to believe this is a race in JDK7's AWT that initSingleton() just happened to stimulate at the wrong time.


Is there any disadvantage to making the initSingleton call early?  If not then I'm going to forge ahead with this workaround.
Reply | Threaded
Open this post in threaded view
|

Re: JOGL/JDK7 RC10+ Display Issue, With Test Case

Sven Gothel
Administrator
On 08/17/2012 09:20 PM, rhatcher [via jogamp] wrote:

> Possible workaround (?):
>
> I attempted to reproduce this problem with Java2D without JOGL in the loop but
> couldn't make it fail.
>
> This pointed another finger at the GLProfile initSingleton method's indirect
> interruption of the event dispatch thread as the triggering event that
> eventually leads to the hangs, so I took a closer look at this method and
> noticed that its javadoc said:
>
> "Applications using this method may place it's (sic) call before any other UI
> invocation in the main class's static block or within the main function.  In
> such case, applications may pass firstUIActionOnProcess=true to use native
> toolkit locking."
>
> Based on this suggestion I added a call to GLProfile.initSingleton() to the
> test app as one of the earliest things it does and this hang problem seems to
> have disappeared in JDK7.  GLProfile.initSingleton(true) also worked, though
> it is deprecated.  This also seems to fix the similar issue I was having in
> our full-up software set.
>
> I'm not sure what this says about how GLProfile.initSingleton() works.  I'm
> still inclined to believe this is a race in JDK7's AWT that initSingleton()
> just happened to stimulate at the wrong time.
GLProfile.initSingleton() calls GLProfile.initSingleton(false)
and initializes the whole JOGL GLProfile/GLContext availability state.
This task is pretty intensive and also involves kicking off
a designated thread holding a shared context/drawable for availability features
and to provide a common GLContext where necessary.
This off-thread initialization blocks the calling thread, AWT-EDT here, until finished.
Even though it is not clear to me how it freezes AWT-EDT indefinetly
for sure this complication has potential to cause havoc, as you have experienced.

Yes, hence it is a good idea to trigger initialization upfront in a more controlled
environment - read: out of AWT-EDT.
Usually this is done by any public GLProfile method like 'GLProfile GLProfile.get*(*)'
which is considered to be run from a user thread - at least in all of our use cases.

Maybe there is some new complication w/ JDK7 which exposes this regression.. dunno.

I will try your use case in a bit later on.

>
>
> Is there any disadvantage to making the initSingleton call early?  If not then
> I'm going to forge ahead with this workaround.

Nope. It's actually good to have it done ASAP, allowing you to query some
JOGL availability - or to be allowed to crash w/o any other dependencies
if there is a JOGL bug etc.

Note [to myself]: I will cleanup the code path regarding the passed boolean 'firstUIActionOnProcess'
so there is no more behavioral difference. This is important since the method is deprecated
and same behavior is intended. Currently the only difference is _when_ X11 XInitThreads() gets called.

~Sven


signature.asc (910 bytes) Download Attachment