Issue:
1- The JVM was running garbage collection every 10 to 15 seconds, spiking the CPU and grinding the system to a halt.
2- The thread dumps revealed that reflection class unloading was happening on every Full GC - see the class below:
sun.reflect.GeneratedSerializationConstructorAccessor
3- A message size exception occurred because the maximum message size limit was set to 30MB and the messages were larger than that.
4- Full garbage collections were taking longer and longer, and JVM pauses were visible in the thread dumps.
Environment: HP-UX 11i v3 on Itanium (64-bit)
Solution:
This is how it got resolved: tuning the Java memory settings, mapping memory pages separately for each call, increasing the stack size, and applying some missing libraries and patches.
The underlying issue is that on HP-UX Itanium 64-bit, Java runs in 32-bit mode by default unless you explicitly ask it to run in 64-bit mode. This can be verified with "java -version", "java -d64 -version", and "java -d32 -version". If the libraries exist and the kernel patches are installed, WebLogic detects the 64-bit JVM and adds the flags "-client -d64" itself, but in some cases it does not, and we need to add them ourselves; a quick check and fix are sketched below.
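A minimal sketch of the check, assuming the HP JDK's java is on the PATH:

java -version        # default data model (32-bit unless forced)
java -d64 -version   # errors out if the 64-bit libraries or patches are missing
java -d32 -version

# One way to force 64-bit mode if WebLogic does not add the flags itself,
# e.g. near the top of setDomainEnv.sh:
JAVA_OPTIONS="${JAVA_OPTIONS} -d64"
export JAVA_OPTIONS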
Also, the JDK that gets installed on HP-UX is not the Oracle JDK but the Oracle JDK ported to HP-UX by HP; hence HP's version 6.0.10 is the same as Oracle JDK 1.6 update 24, and the latest Oracle JDK 1.6 update 29 is in HP's 6.0.13. Oracle JDK 1.7 was just released, and HP's version 7.0.00, from December 2011, is Oracle JDK 7 update 1.
Besides all of the above, to get the huge pile of messages processed, JVM tuning is needed as well; after careful consideration, the solution is below.
Here are the Steps:
1- Go to http://hpux.connect.org.uk/ and download and install the following libraries:
libiconv-1.14
libxml2-2.7.8
libxslt-1.1.26
zlib-1.2.5
2- Install the following HP-UX patches (a verification sketch follows the list):
PHSS_37501
PHCO_38050
PHSS_38139
PHKL_40208
PHKL_35552
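To double-check that the patches landed, a minimal sketch using swlist (standard on HP-UX):

for p in PHSS_37501 PHCO_38050 PHSS_38139 PHKL_40208 PHKL_35552
do
  # swlist -l patch lists the patches installed on the system
  if swlist -l patch | grep -q "$p" ; then
    echo "$p installed"
  else
    echo "$p MISSING"
  fi
done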
3- Set the following kernel parameters (a kctune sketch follows the list):
max_thread_proc 1024
maxfiles 256
nkthread 3635
nproc 2068
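On HP-UX 11i v3 these can be inspected and changed with kctune; a sketch using the values above (some parameters may require a reboot to take effect):

# show the current values
kctune | egrep 'max_thread_proc|maxfiles|nkthread|nproc'
# set the tuned values
kctune max_thread_proc=1024
kctune maxfiles=256
kctune nkthread=3635
kctune nproc=2068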
4- Set the JVM to server mode and export the variable:
JAVA_VM=-server
export JAVA_VM
5- Make the changes in setDomainEnv.sh and make sure these memory arguments take effect - that is, if you are setting something in startManagedServer.sh or somewhere else, it needs to be modified there as well.
If your system is not NUMA-enabled:
if [ "${SERVER_NAME}" = "AdminServer" ] ; then
  USER_MEM_ARGS="-d64 -Xmpas:on -Xss1024k -Xms4096m -Xmx4096m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC"
else
  USER_MEM_ARGS="-Xmpas:on -Xss2048k -Xmx8g -Xmn6g -Xingc -XX:+ForceMmapReserved -XX:PermSize=2g -XX:MaxPermSize=2g -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=3 -XX:NewRatio=4 -XX:CMSTriggerRatio=50 -XX:+UseCompressedOops"
fi
If your system is NUMA-enabled:
if [ "${SERVER_NAME}" = "AdminServer" ] ; then
  USER_MEM_ARGS="-d64 -Xmpas:on -Xss1024k -Xms4096m -Xmx4096m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC"
else
  USER_MEM_ARGS="-Xmpas:on -Xss2048k -Xmx8g -Xmn6g -Xingc -XX:+ForceMmapReserved -XX:PermSize=2g -XX:MaxPermSize=2g -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=6 -XX:+UseCompressedOops -XX:+UseNUMA -XX:-UseLargePages"
fi
[The difference is that when NUMA is enabled, I/O to memory is not an issue, so I let the Full GC trigger at the default 92% occupancy since it will not have an impact. Also note the ParallelGCThreads value - I changed it to 6 because this is an 8-CPU machine with NUMA and I want to keep 2 CPUs free, whereas in the non-NUMA case I only had 4 CPUs.]
Here you can add further conditions to give each managed server its own memory settings as needed, as sketched below.
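For example, a hypothetical extra branch (the name osb_server1 and its values are placeholders, not from the original setup):

if [ "${SERVER_NAME}" = "AdminServer" ] ; then
  USER_MEM_ARGS="-d64 -Xmpas:on -Xss1024k -Xms4096m -Xmx4096m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC"
elif [ "${SERVER_NAME}" = "osb_server1" ] ; then
  # hypothetical managed server that needs a larger heap than the rest
  USER_MEM_ARGS="-Xmpas:on -Xss2048k -Xmx12g -Xmn8g -XX:+UseConcMarkSweepGC"
else
  # fall back to the shared managed-server settings from step 5
  USER_MEM_ARGS="-Xmpas:on -Xss2048k -Xmx8g -Xmn6g -XX:+UseConcMarkSweepGC"
fi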
Explanation: -Xmn (the new generation size) needs to be set explicitly in HP-UX environments; if it is not set, it defaults to 1/3 of the -Xmx value. I raised it to 6g to make the new generation bigger, so the heap does not run out of room and GC runs less frequently.
Not setting the -Xms option and using -XX:+ForceMmapReserved instead is more efficient than asking the JVM to allocate the pages itself; this way the OS mmap reserves the pages.
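To confirm that swap is actually being reserved up front, HP-UX's swapinfo can be checked before and after server start (the flag combination is from memory of the HP-UX man page, so treat this as a sketch):

# -t adds a totals row, -a lists all swap areas, -m reports sizes in MB
/usr/sbin/swapinfo -tam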
-- More details from the HP document:
-XX:+ForceMmapReserved
Tells the JVM to reserve the swap space for all large memory regions used by the JVM (Java™ heap). This effectively removes the MAP_NORESERVE flag from the mmap call used to map the Java™ heaps and ensures that swap is reserved for the full memory mapped region when it is first created. When using this option the JVM no longer needs to touch the memory pages within the committed area to reserve the swap and, as a result, no physical memory is allocated until the page is actually used by the application.
Adding -XX:ParallelGCThreads changes the default behavior: by default it is equal to the number of processors. I want to keep some CPU available while GC is going on, and if I remember correctly we only have 4 CPUs here, so I set it to 3.
Adding -XX:NewRatio=4 also changes the default behavior. It governs the ratio of the new to the old generation, which by default is 1:8, and that seems too small for this setup at CIGNA; changing it to 1:4 makes the new generation bigger, so GC will not run as often.
-XX:CMSTriggerRatio is set to 50 percent so that there is always heap space available while a Full GC is in progress. It is the ratio of free to non-free heap, so up to 4 GB of space will be available when the Full GC starts, and the Full GC will run on only 2 CPUs.
Finally, -XX:+UseCompressedOops directs the JVM to save memory by using 32-bit pointers wherever possible.
-Xingc enables incremental GC, which runs the GC on unused memory in the concurrent mark sweep generation.
6- Added the following to JAVA_OPTIONS in the setDomainEnv.sh file:
-Dsun.reflect.noInflation=true
Sometimes the environment can be tricky, so an alternative to setDomainEnv.sh is to add it to startWebLogic.sh, as sketched below.
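A minimal sketch of the change (the same line works in startWebLogic.sh when setDomainEnv.sh is overridden elsewhere):

# setDomainEnv.sh (or startWebLogic.sh)
JAVA_OPTIONS="${JAVA_OPTIONS} -Dsun.reflect.noInflation=true"
export JAVA_OPTIONS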
7- Implement the message size increase across the WebLogic Server by setting the following (110MB):
-Dweblogic.MaxMessageSize=115343360
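For reference, the value is exactly 110 MB, and it can be appended the same way as the option above:

# 110 * 1024 * 1024 = 115343360 bytes
JAVA_OPTIONS="${JAVA_OPTIONS} -Dweblogic.MaxMessageSize=115343360"
export JAVA_OPTIONS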
8- In another case, OSB on IA64, I also ended up adding the following options:
-XX:+UseNUMA -XX:-UseLargePages
Explanation: When NUMA is enabled, large pages are enabled by default, so here we keep the power of NUMA while disabling the large pages.
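Appended to the managed-server branch from step 5, this might look like:

# keep the NUMA-aware allocator but give up large pages to save address space
USER_MEM_ARGS="${USER_MEM_ARGS} -XX:+UseNUMA -XX:-UseLargePages"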
More Details on NUMA from HP Document:
Starting in JDK 6.0.06, the Parallel Scavenger garbage collector has been extended to take advantage of the machines with NUMA (Non Uniform Memory Access) architecture. Most modern computers are based on NUMA architecture, in which it takes a different amount of time to access different parts of memory. Typically, every processor in the system has a local memory that provides low access latency and high bandwidth, and remote memory that is considerably slower to access.
In the Java HotSpot Virtual Machine, the NUMA-aware allocator has been implemented to take advantage of such systems and provide automatic memory placement optimizations for Java applications. The allocator controls the eden space of the young generation of the heap, where most of the new objects are created. The allocator divides the space into regions each of which is placed in the memory of a specific node. The allocator relies on a hypothesis that a thread that allocates the object will be the most likely to use the object. To ensure the fastest access to the new object, the allocator places it in the region local to the allocating thread. The regions can be dynamically resized to reflect the allocation rate of the application threads running on different nodes. That makes it possible to increase performance even of single-threaded applications. In addition, "from" and "to" survivor spaces of the young generation, the old generation, and the permanent generation have page interleaving turned on for them. This ensures that all threads have equal access latencies to these spaces on average.
The NUMA-aware allocator can be turned on with the -XX:+UseNUMA flag in
conjunction with the selection of the Parallel Scavenger garbage collector. The Parallel Scavenger garbage collector is the default for a server-class machine. The Parallel Scavenger garbage collector can also be turned on explicitly by specifying the -XX:+UseParallelGC option.
Applications that create a large amount of thread-specific data are likely to benefit most from UseNUMA. For example, the SPECjbb2005 benchmark improves by about 25% on NUMA-aware IA-64 systems. Some applications might require a larger heap, and especially a larger young generation, to see benefit from UseNUMA, because of the division of eden space as described above. Use -Xmx, -Xms, and -Xmn to increase the overall heap and young generation sizes, respectively. There are some applications that ultimately do not benefit because of their heap-usage patterns.
Specifying UseNUMA also enables UseLargePages by default. UseLargePages can
have the side effect of consuming more address space, because of the stronger alignment of memory regions. This means that in environments where memory is tight but a large Java heap is specified, UseLargePages might require the heap size to be reduced, or Java will fail to start up. If this occurs when UseNUMA is specified, you can disable UseLargePages on your command line and still use UseNUMA; for example:
-XX:+UseNUMA -XX:-UseLargePages.