Analysis of Out-of-Memory Errors
Recently I was called in to investigate an out of memory error in a customer relationship management suite. This Java-based CRM application was hosted on a virtual Linux server and stored its data in a MySQL database. User requests flowed through two presentation nodes and two business nodes. An admin server component was responsible for batch processing and cache refreshing.
On a regular business day, several slowdowns occurred and the admin server log files contained out of memory error messages. This problem occurred almost every week. After a restart of the admin server, the CRM suite worked without any issues until, five to seven days later, the same problem occurred again.
Judging by the symptoms, this sounded very much like a classic memory leak: initially the system worked fine, after a while it started slowing down, and finally it ended in out of memory errors.
My first checkpoint was the JVM configuration. This admin server node used a max heap size of 7 GB, a NewSize of 1 GB and a MaxNewSize of 1 GB as well. The SurvivorRatio was set to 5.
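To see what these settings mean for the survivor spaces, here is a minimal sketch of the arithmetic. It assumes the usual HotSpot meaning of SurvivorRatio (eden is that many times the size of one survivor space, and the young generation consists of eden plus two survivor spaces) and ignores the alignment and rounding a real JVM applies. The class and method names are made up for illustration. With a 1 GB young generation and a ratio of 5, each survivor space comes out at only about 146 MB.

```java
public class YoungGenSizing {

    /**
     * Approximate size of ONE survivor space.
     * young = eden + 2 * survivor and eden = ratio * survivor,
     * so survivor = young / (ratio + 2).
     */
    public static long survivorBytes(long youngGenBytes, int survivorRatio) {
        return youngGenBytes / (survivorRatio + 2);
    }

    /** Approximate eden size: whatever is left after both survivor spaces. */
    public static long edenBytes(long youngGenBytes, int survivorRatio) {
        return youngGenBytes - 2 * survivorBytes(youngGenBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long young = 1024L * mb;   // NewSize = MaxNewSize = 1 GB, as in the article
        int ratio = 5;             // SurvivorRatio from the article

        System.out.printf("survivor (each): ~%d MB%n", survivorBytes(young, ratio) / mb);
        System.out.printf("eden:            ~%d MB%n", edenBytes(young, ratio) / mb);
    }
}
```

The exact figures on a running JVM will differ slightly, but the order of magnitude shows how small the survivor spaces were compared to the 7 GB heap.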
My second checkpoint was the actual heap utilization. The old generation had plenty of free space; there was never more than 3 GB of old generation space in use. Eden space was also sufficient. In the survivor space, however, there were some spikes that reached the maximum size of this area. This is a clear indicator that the sizing of this JVM was not correct.
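The article does not say which tool was used for this check. As one possible approach, the standard java.lang.management API can list the heap pools (eden, survivor, old generation) with their current usage. The sketch below reports on the JVM it runs in, so it would have to run inside the admin server or be adapted to a remote JMX connection; the class name is hypothetical, and the pool names it prints depend on the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class HeapPoolReport {

    private static final long MB = 1024L * 1024;

    /** Returns one formatted line per heap memory pool of the current JVM. */
    public static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            String maxText = max < 0 ? "undefined" : (max / MB) + " MB";
            lines.add(String.format("%-30s used=%d MB committed=%d MB max=%s",
                    pool.getName(),
                    usage.getUsed() / MB,
                    usage.getCommitted() / MB,
                    maxText));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```

A single snapshot like this will not show spikes; the values would need to be sampled repeatedly over time to spot a survivor space that keeps hitting its limit.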
We removed the NewSize, MaxNewSize and SurvivorRatio settings completely. Since this change, the application has been running fine without any further slowdowns or out of memory errors.
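After such a change it can be worth confirming that the running JVM really was started without the old settings. A small sketch, assuming the options were passed in the usual HotSpot spelling (-XX:NewSize=, -XX:MaxNewSize=, -XX:SurvivorRatio=); the class and method names are hypothetical. It reads the JVM's own startup arguments and lists any explicit young generation settings that are still present.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class YoungGenFlagCheck {

    // The three settings named in the article, in the assumed HotSpot spelling.
    private static final String[] PREFIXES = {
        "-XX:NewSize=", "-XX:MaxNewSize=", "-XX:SurvivorRatio="
    };

    /** Returns every argument that explicitly sizes the young generation. */
    public static List<String> explicitYoungGenFlags(List<String> jvmArgs) {
        List<String> found = new ArrayList<>();
        for (String arg : jvmArgs) {
            for (String prefix : PREFIXES) {
                if (arg.startsWith(prefix)) {
                    found.add(arg);
                    break;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excluding main-method args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> found = explicitYoungGenFlags(jvmArgs);
        if (found.isEmpty()) {
            System.out.println("No explicit young generation sizing flags set.");
        } else {
            System.out.println("Still set: " + found);
        }
    }
}
```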
Lessons learned
- It’s not always a memory leak.
- Check the JVM configuration before you start memory analysis.
Oracle provides a quick overview and more details on the different JVM configuration options at the link below.