I'm currently setting up a new MongoDB Ops Manager machine. The installation works fine, but I can't start the mongodb-mms service: starting Instance 0 fails with a java.lang.OutOfMemoryError. I use the same configuration as on my test server (2 CPU cores, 8 GB RAM), where the service starts without any problem. Changing the ulimit configuration or starting the service as the root user has no effect.
New server specs:
- 10 vCores at 2.0 GHz
- 48 GB RAM
- 800 GB storage
- Ubuntu 18.04 LTS, 64-bit
Since the new server is shared with other users, is it possible that the host has limited CPU usage per user?
mms0.log:
[Starting Logging - App Version: 4.2.23.57072.20210126T1748Z]
2021-03-28T19:32:11.682+0000 [main] INFO com.xgen.svc.mms.dao.mongo.MongoSvcUriImpl [MongoSvcUriImpl.java.initMorphiaMapper:154] - Initialized Morphia in 12538ms
2021-03-28T19:32:12.319+0000 [main] INFO com.xgen.svc.mms.dao.mongo.MongoSvcUriImpl [MongoSvcUriImpl.java.<init>:89] - Created MongoSvc with 1 client(s)
[Starting Logging - App Version: 4.2.23.57072.20210126T1748Z]
2021-03-28T19:33:07.998+0000 [main] INFO com.xgen.svc.core.ServerMain [ServerMain.java.doPreFlightCheck:295] - Starting pre-flight checks
2021-03-28T19:33:20.990+0000 [main] INFO com.xgen.svc.mms.dao.mongo.MongoSvcUriImpl [MongoSvcUriImpl.java.initMorphiaMapper:154] - Initialized Morphia in 12920ms
2021-03-28T19:33:21.555+0000 [main] INFO com.xgen.svc.mms.dao.mongo.MongoSvcUriImpl [MongoSvcUriImpl.java.<init>:89] - Created MongoSvc with 1 client(s)
2021-03-28T19:33:22.983+0000 [main] INFO com.xgen.svc.core.ServerMain [ServerMain.java.doPreFlightCheck:301] - Successfully finished pre-flight checks
2021-03-28T19:33:22.984+0000 [main] INFO com.xgen.svc.core.ServerMain [ServerMain.java.start:308] - Starting mms...
2021-03-28T19:33:23.142+0000 [main] INFO com.xgen.svc.core.ServerMain [ServerMain.java.createNonSSLConnector:843] - Creating HTTP listener on *:8080
2021-03-28T19:33:23.438+0000 [main] ERROR com.xgen.svc.core.ServerMain [ServerMain.java.main:226] - Cannot start mms server [FATAL-EXITING] - instance: 0 - msg: unable to create native thread: possibly out of memory or process/resource limits reached
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
        at java.base/java.lang.Thread.start0(Native Method)
        at java.base/java.lang.Thread.start(Thread.java:803)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:660)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.ensureThreads(QueuedThreadPool.java:642)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.doStart(QueuedThreadPool.java:182)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
        at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
        at org.eclipse.jetty.server.Server.start(Server.java:423)
        at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
        at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
        at org.eclipse.jetty.server.Server.doStart(Server.java:387)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
        at com.xgen.svc.core.ServerMain.start(ServerMain.java:424)
        at com.xgen.svc.core.ServerMain.main(ServerMain.java:221)
mms0-startup.log:
[23,180s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 512k, guardsize: 0k, detached.
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
[19,947s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 512k, guardsize: 0k, detached.
Cannot start mms server [FATAL-EXITING] - instance: 0 - msg: unable to create native thread: possibly out of memory or process/resource limits reached
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
        at java.base/java.lang.Thread.start0(Native Method)
        at java.base/java.lang.Thread.start(Thread.java:803)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.startThread(QueuedThreadPool.java:660)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.ensureThreads(QueuedThreadPool.java:642)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.doStart(QueuedThreadPool.java:182)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
        at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
        at org.eclipse.jetty.server.Server.start(Server.java:423)
        at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
        at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
        at org.eclipse.jetty.server.Server.doStart(Server.java:387)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
        at com.xgen.svc.core.ServerMain.start(ServerMain.java:424)
        at com.xgen.svc.core.ServerMain.main(ServerMain.java:221)
ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1544321
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 62987
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Answer
SUGGESTION: focus on your JVM:
- Ensure you have a 64-bit version of Java
- Try tuning your JVM parameters:
https://docs.opsmanager.mongodb.com/current/reference/troubleshooting/system/
Open mms.conf in your preferred text editor.
Find this line:
JAVA_MMS_UI_OPTS="${JAVA_MMS_UI_OPTS} -Xss228k -Xmx4352m -Xms4352m -XX:NewSize=600m -Xmn1500m -XX:ReservedCodeCacheSize=128m -XX:-OmitStackTraceInFastThrow"
Change the -Xmx and -Xms values to a larger value. Both parameters should be set to the same value to remove any performance impact from the VM constantly reclaiming memory from the heap.
The value is specified as #k|m|g: a number followed by
k (kilobytes), m (megabytes), or g (gigabytes).
By default, -Xmx and -Xms are both set to 4,352 MB (4352m).
EXAMPLE: To set the Java heap to 10 GB, set this value to:
-Xmx10g -Xms10g
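For example, the edited line in mms.conf might then look like the sketch below; this keeps the other default options shown above, so adjust them to whatever your installation actually contains:

JAVA_MMS_UI_OPTS="${JAVA_MMS_UI_OPTS} -Xss228k -Xmx10g -Xms10g -XX:NewSize=600m -Xmn1500m -XX:ReservedCodeCacheSize=128m -XX:-OmitStackTraceInFastThrow"

After saving the change, restart the mongodb-mms service so the new heap settings take effect.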
STRONG SUGGESTION: I would continue to focus on JVM settings. However, this link might also be relevant:
https://stackoverflow.com/a/31445282/421195
I encountered a similar issue in our Test Ops Manager deployment when we upgraded to Ops Manager 1.8.0. I ultimately opened up a ticket with MongoDB Support and this was the resolution for our issue:
The Ops Manager components are launched using the default username "mongodb-mms". Please adjust the ulimit settings for this user to match those of the "mongodb" user, currently defined in /etc/security/limits.d/99-mongodb-mms-automation-agent.conf. You may wish to add a separate file under /etc/security/limits.d/ for the mongodb-mms user. More information can be found here.
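For illustration, such a file could look like the sketch below; the file name and the limit values are examples (not taken from the support ticket), so match them to whatever is defined for the "mongodb" user on your system:

# /etc/security/limits.d/99-mongodb-mms.conf  (example values only)
mongodb-mms   soft   nofile   64000
mongodb-mms   hard   nofile   64000
mongodb-mms   soft   nproc    64000
mongodb-mms   hard   nproc    64000

A fresh login session (or a service restart) is needed before the new limits apply to the mongodb-mms processes.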
New information:
So I tried a fresh install with the same versions of MongoDB (4.4.3) and Ops Manager (4.4.8.100) to check whether something was wrong with the newest versions. It throws the same error.
I tried running jconsole -debug, which produced:
[1,323s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached
This suggests you might be running out of threads.
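A rough way to verify that (these commands are only a suggestion, and the "mms" search pattern is an assumption about the process name) is to compare the limits the running process actually has with the number of threads already in use by its user:

# Find the Ops Manager Java process
pgrep -af mms

# Show the effective limits of that process (replace <PID> with the PID found above)
grep -E 'processes|open files' /proc/<PID>/limits

# Count all threads currently owned by the mongodb-mms user
ps -o nlwp= -u mongodb-mms | awk '{sum += $1} END {print sum}'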
Relevant links:
https://github.com/elastic/elasticsearch/issues/31982
Elasticsearch version (bin/elasticsearch --version): 6.3.1
JVM version (java -version): 10
OS version: CentOS
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached, but my OS has 80 GB of free memory.
I used docker.elastic.co/elasticsearch/elasticsearch:6.3.1,
JVM config: -Xms32g -Xmx32g…
[I had] a similar (but likely unrelated) issue in our app (which is using the ES client). For whatever reason, it had gone berserk during the weekend, spawning 9400 threads which made the machine fail in new thread creation for the same user account.
ps -o nlwp,pid -fe helped me spot this, so I could kill the bad process and get the system back to a usable state. Greatly appreciated!
Here is an example of ps -o nlwp,pid -fe from my Ubuntu system (an AWS VM). I suspect your "ps" output will look very, very different:
# ps -o nlwp,pid -fe
NLWP   PID
   1 13409
   1 13410
   1 13418
   1   915
   1   911
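To spot a runaway process like the one described in that comment, one option (my suggestion, not part of the quoted comment) is to sort the process list by thread count and look at the top entries:

# Processes with the highest thread counts first
ps -eo nlwp,pid,user,comm | sort -rn | head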
Addendum:
I switched the OS from Ubuntu 18.04 LTS (64-bit) to CentOS 8, and now it's working perfectly.