Hi
You should use binary X3D rather than XML X3D to handle such a big model.
Do only the operations that require a current OpenGL context in the GLEventListener, and do the rest elsewhere if possible. Personally, when I used raw JOGL without a high-level engine, I loaded models piece by piece: each display() call loaded a single piece as long as at least one piece remained to be loaded. You have only a single model, so you can load it once and use a flag to indicate that it is already loaded, which avoids reloading it at each display() call. Just keep the VBO(s) and the texture(s) of your model in memory.
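To make that concrete, here is a rough sketch of the pattern in plain Java. It deliberately leaves out JOGL so that it runs on its own. IncrementalModelRenderer, its display() method and the piece names are made up for illustration; in real code display() would be GLEventListener.display(GLAutoDrawable), and the "upload" step would create the VBO(s) and texture(s) with the current GL context.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a GLEventListener; display() plays the role of
// the per-frame callback that JOGL invokes with a current OpenGL context.
class IncrementalModelRenderer {
    // Pieces that still have to be uploaded (parsed elsewhere, off the GL thread).
    private final Queue<String> pendingPieces = new ArrayDeque<>();
    // Stand-in for the VBO/texture handles kept in memory after upload.
    private final List<String> uploadedPieces = new ArrayList<>();
    private int uploadCount = 0;

    IncrementalModelRenderer(List<String> pieceNames) {
        pendingPieces.addAll(pieceNames);
    }

    // Acts as the "already loaded" flag: true once nothing remains to upload.
    boolean isLoaded() {
        return pendingPieces.isEmpty();
    }

    int getUploadCount() {
        return uploadCount;
    }

    List<String> getUploadedPieces() {
        return uploadedPieces;
    }

    void display() {
        // Upload at most one piece per frame so a single frame never stalls for long.
        if (!isLoaded()) {
            String piece = pendingPieces.poll();
            // Real code would create and fill the VBO(s)/texture(s) for this piece here.
            uploadedPieces.add(piece);
            uploadCount++;
        }
        // Real code would now draw everything in uploadedPieces;
        // nothing is ever uploaded a second time.
    }
}
```

With a single model, pass a list containing one piece: the first display() call uploads it and every later call only draws. The expensive part (reading and parsing the X3D file) can stay outside display(), since it does not need the OpenGL context.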
I don't really see what is difficult about that. It has nothing to do with OpenGL: just keep the data required to render the model multiple times and use a flag to avoid loading it several times.
Of course, I agree with Wade.