A Jira server I administer just hung, and a stack trace shows lots of BLOCKED threads; either like:

"ajp-nio-127.0.0.100-8009-exec-182" #640282 daemon prio=5 os_prio=0 cpu=0.51ms elapsed=147.51s allocated=11696B defined_classes=0 tid=0x00007f6ce847c000 nid=0x2102f6 waiting for monitor entry  [0x00007f6c97b97000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at sun.security.provider.NativePRNG$RandomIO.implNextBytes(java.base@11.0.17/NativePRNG.java:544)
        - waiting to lock <0x0000000500fc0270> (a java.lang.Object)
        at sun.security.provider.NativePRNG.engineNextBytes(java.base@11.0.17/NativePRNG.java:220)
        at java.security.SecureRandom.nextBytes(java.base@11.0.17/SecureRandom.java:751)
        at java.util.UUID.randomUUID(java.base@11.0.17/UUID.java:150)
        at com.atlassian.jira.instrumentation.DefaultInstrumentationListenerManager.startContext(DefaultInstrumentationListenerManager.java:149)
...

or:

"ajp-nio-127.0.0.100-8009-exec-186" #640286 daemon prio=5 os_prio=0 cpu=0.96ms elapsed=106.49s allocated=126K defined_classes=0 tid=0x00007f6ce83e9800 nid=0x210991 waiting for monitor entry  [0x00007f6c9b524000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at sun.security.provider.NativePRNG$RandomIO.implNextBytes(java.base@11.0.17/NativePRNG.java:544)
        - waiting to lock <0x0000000500fc0270> (a java.lang.Object)
        at sun.security.provider.NativePRNG.engineNextBytes(java.base@11.0.17/NativePRNG.java:220)
        at java.security.SecureRandom.nextBytes(java.base@11.0.17/SecureRandom.java:751)
        at com.atlassian.security.random.DefaultSecureRandomService.nextBytes(DefaultSecureRandomService.java:46)
        at com.atlassian.security.random.DefaultSecureTokenGenerator.generateToken(DefaultSecureTokenGenerator.java:47)
        at com.atlassian.jira.security.xsrf.XsrfTokenStrategy.createToken(XsrfTokenStrategy.java:87)
...

These thread holding lock 'c0270 there is:

"DelayedLoginStore:thread-0" #573 daemon prio=5 os_prio=0 cpu=126392.47ms elapsed=1031762.88s allocated=18315M defined_classes=19 tid=0x00007f6d2b25a800 nid=0x429cb waiting for monitor entry  [0x00007f6ca24a4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at sun.security.provider.NativePRNG$RandomIO.implNextBytes(java.base@11.0.17/NativePRNG.java:544)
        - locked <0x0000000500fc0270> (a java.lang.Object)
        at sun.security.provider.NativePRNG.engineNextBytes(java.base@11.0.17/NativePRNG.java:220)
        at java.security.SecureRandom.nextBytes(java.base@11.0.17/SecureRandom.java:751)
        at java.util.UUID.randomUUID(java.base@11.0.17/UUID.java:150)
        at com.atlassian.jira.instrumentation.DefaultInstrumentationListenerManager.startContext(DefaultInstrumentationListenerManager.java:149)
...


This is with OpenJDK version "11.0.17" 2022-10-18".

What went wrong?

This appears to be an instance of a fairly notorious problem where apps exhaust the entropy of a system, expressed as availability of bytes from /dev/random, which per the Javadocs for NativePRNG.java), is the "default seed source".

I'm not convinced though, because my threads don't contain a FileInputStream.readBytes()  call, as would be the case if /dev/[u]random  was really being read, as shown in this example of /dev/random blocking in Bamboo.  If anyone has suggestions, please comment.

The fix

Stackoverflow to the rescue. Per the advice there for Java 11, I now set:

JVM_SUPPORT_RECOMMENDED_ARGS="-Djava.security.egd=file:/dev/./urandom"

In Jira and Confluence's bin/setenv.sh  

If you set this (or any system-wide) flag, document why and when. The flag might be obsoleted in future, or even cause trouble if left when no longer needed.

  • No labels