Looking around online, the common suggestion is to do something similar to a cubemap: take a 90°×90° "picture" of the scene in each of 6 directions.
What do you think?
Moreover, this should be done in the background: the image will simply be delivered as a file. So should I generate it in the back buffer without swapping, in a separate offscreen buffer, or somewhere else?
Generally it is better to render fewer pictures to avoid projection errors, so I would go for 6 images with a 90° FOV each. But if you use an external program to stitch the images together, I guess more images with smaller angles could help to get a better match.
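To make the 6-image setup concrete, here is a minimal sketch of the six camera orientations in plain Java. The class name `CubeFaces`, the face labels, and the Y-up convention with "front" looking down -Z are my own assumptions, not something from your code; adapt them to your scene.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: lists the six views needed to cover the full sphere.
final class CubeFaces {
    // A 90 degree FOV with a square viewport makes the six frusta tile the sphere exactly.
    static final double FOV_DEGREES = 90.0;
    static final double ASPECT = 1.0;

    // Each entry is {dirX, dirY, dirZ, upX, upY, upZ}.
    // Assumption: world is Y-up and "front" looks down -Z.
    static Map<String, double[]> faces() {
        Map<String, double[]> f = new LinkedHashMap<>();
        f.put("front",  new double[] { 0,  0, -1,  0, 1,  0});
        f.put("right",  new double[] { 1,  0,  0,  0, 1,  0});
        f.put("back",   new double[] { 0,  0,  1,  0, 1,  0});
        f.put("left",   new double[] {-1,  0,  0,  0, 1,  0});
        // Pitching the front view up or down by 90 degrees rotates the up vector too.
        f.put("top",    new double[] { 0,  1,  0,  0, 0,  1});
        f.put("bottom", new double[] { 0, -1,  0,  0, 0, -1});
        return f;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, double[]> e : faces().entrySet()) {
            double[] v = e.getValue();
            System.out.printf("%-6s dir=(%.0f, %.0f, %.0f) up=(%.0f, %.0f, %.0f)%n",
                    e.getKey(), v[0], v[1], v[2], v[3], v[4], v[5]);
        }
    }
}
```

For each face you would set a perspective projection with `FOV_DEGREES` and `ASPECT`, point the camera from the eye position along `dir` with the given `up`, render into a square viewport, and read the result back.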
Sure, glReadPixels is fine (performance is not that important in your case). Another very simple solution would be to render to the front buffer and then use the JOGL utility Screenshot class to dump it directly to an image file.
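If you go the glReadPixels route, keep in mind that OpenGL returns rows bottom-to-top, while Java images are stored top-to-bottom. A rough sketch of the conversion, assuming you read the pixels as tightly packed `GL_RGBA` / `GL_UNSIGNED_BYTE` into a `ByteBuffer` (the class and method names `PixelFlip.toImage` are made up for illustration):

```java
import java.awt.image.BufferedImage;
import java.nio.ByteBuffer;

// Hypothetical helper: turns a buffer filled by glReadPixels into a BufferedImage.
final class PixelFlip {
    // Assumption: rgba holds width * height pixels, 4 bytes each (R, G, B, A),
    // with the first row in the buffer being the BOTTOM row of the picture.
    static BufferedImage toImage(ByteBuffer rgba, int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = (y * width + x) * 4;
                int r = rgba.get(i) & 0xFF;
                int g = rgba.get(i + 1) & 0xFF;
                int b = rgba.get(i + 2) & 0xFF;
                int a = rgba.get(i + 3) & 0xFF;
                int argb = (a << 24) | (r << 16) | (g << 8) | b;
                // Flip vertically: GL row 0 becomes the last image row.
                img.setRGB(x, height - 1 - y, argb);
            }
        }
        return img;
    }
}
```

The resulting image can then be written out with `javax.imageio.ImageIO.write(img, "png", file)`, once per cube face.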