Go2Oracle

Friday, May 10, 2013

SOA Suite Knowledge – Polyglot Service Implementation with Groovy

Polyglot programming, defined most simply, is the use of multiple languages in software development. Implementing services on the SCA Container is already a intended polyglot development approach. Oracle SOA Suite have the SCA service implementation types of BPEL, Human Workflow, Mediator, Rule and Spring Components. These components are mixing the General-Purpose Language (GPL) Java with Domain-Specific Languages (DSL) like XSD, WSDL or Schematron. But Spring Components are also enabler of service implementations with other JVM GPL languages beside Java.

In my opinion Neal Ford was absolutely right when he coined the term “Polyglot Programming” and predicts “Applications of the future will take advantage of the polyglot nature of the language world … The times of writing an application in a single general purpose language is over” already in 2006. In order to produce polyglot implementations you need environments where polyglot-ism is required or at least encouraged. Oracle SOA Suite is such a polyglot supporting environment and you are doing polyglot development all the time (e.g. XML, WSDL, SQL, Java). But also on the GPL side the developers or not limited on Java. SCA Spring Components are supported since Patch Set 2 and the Spring Framework supports dynamic languages since version 2 (BeanShell, JRuby and Groovy).

Groovy is a General-Purpose Dynamic Language and has the best and seamless integration with Java (beside Spring integration Groovy supports also direct and JSR 223 integration). Therefore, using Groovy has a low threshold for Java developers; it is easy because Groovy has a Java-like syntax. Most Java code is also syntactically valid Groovy. So it’s an iterative learning approach for Java developers switching to Groovy. My first contact with Groovy was at the JAX conference in 2008. It was a Groovy newbie session from Dierk König (well-known book author of "Groovy in Action"). And for the first time in a long while of Java programming I had an awakening and still addicting interest on this development language.

The Groovy language is meant to complement Java and is adding a wide range of features that are sadly lacking in Java (for example Closures, Dynamic Typing and Metaprogramming – just to name a few). Concurrency and parallelism is increasingly important in the era of multi-core CPUs with a growing number of CPU cores. GPars is part of Groovy and offers a intuitive and safe way to handle tasks concurrently. Groovy makes it also easy to create DSLs (to simplify the “solution”). Optional parentheses, optional semicolons, method pointers and Metaprogramming let the code viewed as “executable pseudo code” (and easy readable by non-programmers). One of my personal favorites on using Groovy is the easy XML handling, both for consuming and producing XML content with its XML builders and parsers. Therefore I encourage you to take a deeper look on Groovy.

In my opinion Groovy envision also a bright future because the Spring development team announced to put a strong focus on Groovy on the upcoming Spring Framework 4.0. Spring 4 will about properly supporting Groovy 2 as a first-class implementation language for Spring-based application architectures - not just for selected scripted beans but rather for an entire application codebase, as a direct alternative to the standard Java language (quoting on Jürgen Höller from SpringSource who is the Spring Framework co-founder and project leader). The Spring Framework is the most popular application development framework for enterprise Java and will become therefore the driving force for getting more Java developers in touch with Groovy (and let them feel what is the “right language for the job”).

The latest version of Oracle SOA Suite (v11.1.1.7) comes with Spring v2.5.6 and has also the Groovy v1.6.3. libraries on board. These versions are outdated, because Spring v.2.5.6 was released in November 2008 and Spring is already on the 3.x version. Groovy v1.6.3 was released in Mai 2009 and Groovy is today on version 2.x. Anyway, since Oracle SOA Suite PS5 (11.1.1.6) it’s possible to do Spring Bean implementation using Groovy. Oracle itself is also using Groovy, for example the Rule Reporter is written in Groovy or Oracle ADF is using Groovy as well. But the official documentation on how to write SOA Suite Spring Beans with Groovy should be improved, because you need more details to make our polyglot implementation running. Motivating for using Groovy and showing the details are the reasons for this blog. Now lets go to the details.

First you have to do some configure steps. I show you the steps with the out-of-the-box coming Groovy library v1.6.3. But you also could download the latest Groovy version and make use of the newest Groovy features (I did a successful test with Groovy v2.1.3).

1.) Copy the $MW_HOME/oracle_common/modules/groovy-all-1.6.3.jar library to the $ORACLE_HOME/soa/modules/oracle.soa.ext_11.1.1 folder

You can add JAR files and custom classes to a SOA composite application. A SOA extension library for adding extension JARSs and classes to a SOA composite application is available in the $ORACLE_HOME/soa/modules/oracle.soa.ext_11.1.1 directory. The Groovy library has become known to SOA composite applications.

2.) Run ANT at the $ORACLE_HOME/soa/modules/oracle.soa.ext_11.1.1 folder

Running ANT on the "soa.ext" folder will update the oracle.soa.ext.jar library with a changed MANIFEST.MF file, adding a reference to Groovy.

Warning: This procedure is not cluster-friendly. If you're running a cluster, you must copy the file and run ANT on each physical server that forms the domain.

3.) Add the Groovy library to the Weblogic system classpath. Open the setDomainEnv configuration file and put the Groovy library on the POST_CLASSPATH setting.

4.) Restart the Weblogic application server

Once the configuration is done, the Spring Bean Groovy coding could start. I’m reusing the Calculator example from an older blog about Spring Bean Contract-First. Therefore I copy the Calculator WSDL on the project folder, create a Web Service Adapter and pointing on the Calculator WSDL file. Afterwards I create an Spring Bean component and wire the Spring Bean with the Web Service Adapter (contract-first approach). The JDeveloper wizard creates the necessary Java files for the Spring Bean.

I’m doing only one change on the setter methods of the generated type classes for the response (AddResponseType, SubtractResponseType, MultiplyResponseType and DivideResponseType). The setter should return “this”. This approach is commonly called a builder pattern or a fluent interface. You don’t need to care about the returned object if you don't want, so it doesn't change the "normal" setter usage syntax. But this approach allows you also to return the changed class (I’m using this approach together with the end return Groovy feature) and to chain setters together.

Next step is the Groovy implementation of the generated Java ICalculatorService interface. I’m placing the GroovyCalculator class on the same package name like the Java interface but on the <project>\SCA-INF\src folder instead of the <project>\src folder (details below).

The implementation is coming in Groovy style. Semicolons and the default public visibility declaration disappeared. Variables are dynamically typed (keyword def). The mathematical operations are Closures (support for Closures in Java is planned for Java 8; recently postponed to March 2014). The last expression result will be returned (so-called end return), therefore the keyword return is in Groovy optional. In general, Groovy code is less verbose, and more expressive and compact.

You have to make sure that the Groovy class is coming on the SCA-INF/classes folder at the SOA archive (SAR) JAR file. Therefore you have to place the Groovy class file on the SCA-INF/src sub-folder and instruct JDeveloper to copy also files with groovy extension on the Output Directory (Project Properties –> Compiler –> Copy File Types to Output Directory).

The final step involves defining dynamic-language-backed bean definition on the Spring Bean Context file (the wiring-XML-file). One for each bean that you want to configure (this is no different to normal Java bean configuration). However, instead of specifying the fully qualified classname of the class that is to be instantiated and configured by the container, you use the <lang:language/> element to define the dynamic language-backed bean. For Groovy you have to use the <lang:groovy/> element (which instructs Spring to use the specific Groovy-Factory).

Afterwards the SCA Composite could be deployed and tested. The Groovy script will be automatically compiled during the deployment on the application server. It is worth to mention that Spring supports also refreshable beans and inline scripting.

Refreshable beans allow code changes without the need to shut down a running service or redeploy the service. The dynamic-language-backed bean so amended will pick up the new state and logic from the changed dynamic language source file. Be careful, in my opinion it’s a powerful but dangerous feature. One reason for inline scripting could be a quick Mock Bean implementation (take a look on the Mock example below).

The Mock implementation is always returning 42 as a result.

Groovy coding errors will result in a ScriptCompilationException giving you more details about the occurred issue (reason, line, column). For example …

There was an error deploying the composite on SOAServer: Error occurred during deployment of component: CalculatorSB to service engine: implementation.spring, for composite: Calculator05Groovy: SCA Engine deployment failure.: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'GroovyCalculatorBean': BeanPostProcessor before instantiation of bean failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'GroovyCalculatorBean': Could not determine scripted object type for GroovyScriptFactory: script source locator [classpath:calculator/example/suchier/GroovyCalculator.groovy]; nested exception is org.springframework.scripting.ScriptCompilationException: Could not compile script; nested exception is org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed, GroovyCalculator: 17: unexpected token: this @ line 17, column 84.

Often I’m using SoapUI beside Enterprise Manager Fusion Control for doing first tests. Here is the test SoapUI output for a valid division.

This is the output for a division by zero. The exception is throw on the Groovy class, but is also a standard based SOAP fault.

The source code is available as 7-zip here.

That’s it. This is not a bash-the-other-languages blog. But my personal favorite language beside Java is Groovy and there many reasons to take a look on Groovy also as a SOA Suite developer. Now you have the details to start Groovy based polyglot service implementation on Oracle SOA Suite. If you are a Groovy beginner, I would recommend to take a look at the excellent books “Groovy in Action” (from Dierk Koenig) and “Groovy Recipes” (from Scott Davis).

Monday, March 4, 2013

SOA Suite Knowledge – MTOM enabled Web Services

You might have heard about MTOM/XOP. Possibly you have a broad idea of what MTOM/XOP is doing. Than you start from the same initial situation like me some months ago. Let me give you a jump start on what MTOM/XOP means in general and in particular on the Oracle platform.

MTOM is the W3C standardized Message Transmission Optimization Mechanism which is used in combination with the concrete referencing mechanism XOP which stands for XML-binary Optimized Packaging. MTOM/XOP is a way to handle opaque data with XML like binary data of images, audio or video. MTOM/XOP does not use any compression and decompression technique. There is another standard called Fast Infoset that addresses the compression issue. But large application data which need to be compressed for efficacy could be also handled by MTOM/XOP (more about it on the given example below).

XML parsers will fail when you place binary data in a text node of a XML document. The binary data could contain reserved XML characters like ‘<’, ‘>’ or ‘&’ which will break the parser on the “other side”. CDATA wrapping allows these reserved XML characters but the parser is looking for a ‘]]>’ (which marks a CDATA section end) and there is the risk that the binary data have such a byte sequence. Base 64 encoding (xsd:base64Binary) is a solution because each byte is mapped to a valid 64 character “alphabet” which guarantees that a byte is not misinterpreted as anything that would break an XML parser. Unfortunately, base 64 encoded binary data are on average 33% larger (number of bytes) than the raw data. The alternative of hexadecimal encoding (xsd:hexBinary) even expands the binary data by a factor of 2.

Another idea to handle opaque data is to place them outside of the SOAP message. This technique is well-known in the Java world as “SOAP with Attachments” (SwA). The SOAP message contains a reference to the binary data which are placed at the Multipurpose Internet Mail Extension (MIME) attachment.

But the binary attachment is not part of the SOAP Envelope at all. It’s leaving it up to the message processor to retrieve the actual binary data. Standards like WS-Security couldn’t work anymore, because WS-Security can’t be used to secure attachments. Furthermore the Microsoft world was using DIME (Direct Internet Message Encapsulation) encoded attachments. It’s not a good idea that Web Service providers have to support both attachment technologies, SwA and DIME.

Since 2005 the agreed solution is MTOM/XOP. It has the efficiency of SwA, but it does so without having to break the binary data outside of the SOAP Envelope. XOP allows binary data to be included as part of the XML Infoset. When data is within the XML Infoset, it means that it operates within the constraints of the SOAP processing model. In other words, you can control the processing of the XML Infoset using SOAP headers with policies.

MTOM/XOP uses a XOP:include element to tell the processor to replace the content with the referenced binary data which are “stored” outside of the XML document at the MIME attachment part. The logic of discrete storage and retrieval of binary data becomes inherent to the XML parser. The SOAP parser could treat the binary data in a uniform way and as part of the XML document, with no special retrieval logic. Fortunately sender and receiver are not aware that any MIME artifacts were used to transport the base64Binary or hexBinary data.

Let’s take a look on how to enable MTOM/XOP on SOA Suite 11g. The real world requirement is to build a cache warming-up dictionary service for the Web Presentation Tier which has no database access. Therefore the SCA Container running on the Service Tier provides a Dictionary Entity Service which should be able to transport large datasets to the Web Presentation Tier intermediated by an OSB.

A first top-down WSDL-first implementation would cause the client to send a request for Dictionary ID ranges (request example); afterwards the service retrieves the requested dictionary data from the database and sends the response back to the client as normal dictionary entry structure (response example).

Request Example:

Response Example:

It’s obviously that a large range of Dictionary IDs would cause a large SOAP Response message. But in our case we have a Spring Bean implementation using the simple JDBCTemplate approach. The result of the database request is a list of maps (line 11 and 12) which is mapped on the service response (line 32 to 38).

How about that the service return the raw list instead of the collection of dictionary entry type? Isn’t it a technical motivated private service where the known Java client on the Web Presentation Tier has to navigate through the list anyway? Yes, it is.

The service should become MTOM enabled in order to handle the binary data as part of the message. The same request structure should return a byte array with the data list returned from the database. Two meta data information should be returned additionally, how much dictionary entries are returned and the byte size of the array.

The XSD structure is simple. Element Size and RecNum are optional and of type long. And here we use the base64Binary type for the list of dictionary entries (Element return).

MTOM will be enabled with a right mouse click on the Web Service Adapter placed at the Exposed Services swim lane. The menu option “Configure WS Policies …” will open the Web Service Policies configuration wizard. You only have to enable the MTOM policy (oracle/wsmtom_policy).

The Web Service Adapter on the graphical SCA Editor will have a lock icon at the right upper corner after adding the policy. Behind the scene the wizard will add only one additional configuration line on the Web Service binding (line 4) at the composite.xml file.

When you deploy the MTOM enabled service and take a look at the WSDL URL you will realize that Oracle SOA Suite added a MTOM policy to the WSDL (line 6 to 8). Additionally the WSDL binding has a policy reference to the MTOM policy (line 100). The wsdl:required="false" declaration means that also not MTOM enabled clients should work. Tools such as soapUI or SoapSonar should be able to invoke the service which makes sense because the service offers operations without MTOM involvement. From my experiences with soapUI it’s not working because both MTOM request properties (“Enable MTOM” and “Force MTOM”) have to be enabled always. Otherwise the service responses with an internal sever error (“Non-MTOM message has been received when MTOM message was expected.”).

Oracle SOA Suite will create a byte[] equivalence for the XSD type base64Binary. So byte[] is the object representation of XSD types base64Binary and hexBinary. The only need is to convert the Spring database result list to a byte array (by using the classes ObjectOutputStream and ByteArrayOutputStream inside of a DataMTOMUtil helper class at line 34). The two meta data values are quickly and easily determined (line 37 and 38).

The same request will now result in a more interesting response containing the XOP:include.

So the two meta data information of size and record numbers are like expected. But the dictionary result is now placed with context type “application/octet-stream” (=binary attachment) at the MIME part attachment with a given content ID. soapUI is showing you a raw view on the complete SOAP message.

As you could see, the HTTP message has the content type “multipart/related” which is used for compound documents. The type is “application/xop+xml”. The dictionary list is send as binary part of the MIME message with the content type “application/octet-stream” and the same content ID which is placed at the href attribute (cid:6df6245c6aff457d8d51db92b6707170).

You will realize an interesting effect when the binary data are below a size of 1024 byte. In this case the Web Service framework will not create a separate attachment. Instead the binary data are placed directly on the XML message.

The MTOM threshold of 1024 byte is a default value. When the potential attachment bytes are less than that threshold size, the data will be inlined. The MTOM threshold determines the break-even point between sending data inline or as attachment.

I already mentioned that MTOM/XOP will not do any kind of data compression. But it’s obvious that data compression will have a great benefit on our solution. The objToByteArray method is using standard OutputStream classes. It’s a quick implementation to use the java.util.zip.GZIPOutputStream for additional compression (line 10).

The standard package java.util.zip provides classes for reading and writing standard GZIP format which is a popular compression algorithm. The client side creation is simply using the JDeveloper Web Service Proxy wizard. The wizard creates a Web Service client skeleton which is also MTOM/XOP enabled.

Regarding MTOM/XOP you only have to enable the “oracle/wsmtom_policy” during the Web Service client creation.

Finally you place your code snippets at the created Web Service client skeleton. In our case the conversation of a byte array back to a List object (line 31) and the mapping to the dictionary structure (line 33 to 38). The code has also to consider the GZIP decompression (line 7).

What is the effect of data compression? I executed the service with several dictionary ranges and put the values on a spreadsheet. Additionally I added the size for base64 encoding (which would be used without MTOM/XOP).

Dictionary Rec Num	Dictionary List Size Uncompressed (byte)	Dictionary List Size GZip Compressed (byte)	Dictionary List Size Uncompressed Base64 (byte)	Dictionary List Size Compressed Base64 (byte)
1	646	493	864	660
14	2706	740	3608	988
194	36383	3737	48512	4984
1783	307619	13830	410160	18440
11574	2188494	89687	2917992	119584

The compression rate is significant. Take a look at the last line. When I would transport 11574 records without MTOM on the base64Binary element I have to base 64 encoding of the data. In this way I need 2.78 MB byte (2917992 byte) which is 25% more than the original dictionary list size of 2.09 MB (2188494 byte). A GZIP compression will bring down the original dictionary list size at 0.09 MB (89687 byte). This is the final size which travels on the MTOM/XOP MIME attachment which is around 4.1% of the original size.

Therefore service is able to transport much more data in a given time with the GZIP compression. Base64 encoding will add additional size and is not a good option (even when you compress the dictionary list because it’s still around 25% larger). So MTOM/XOP with GZIP compression is the best solution.

The solution shown happens in memory. The Oracle documentation describes also a streaming approach which may be interesting in case of very large binary data. But the documentation also mentions limitations regarding message encryption which is not working on streaming MTOM.

Thursday, November 15, 2012

SOA Suite Performance – Right Garbage Collection Tuning - Part 2

My last blog pointed out the importance of the machine power and that it’s impossible to regain performance on the higher architectural layers when the machine power isn’t adequate. This time I want to focus on another very important performance adjusting screw – the JVM and especially Garbage Collection (GC).

GC tuning is in most cases an indispensable prerequisite for good performance on non-trivial projects from a certain size (and Oracle SOA Suite project are designated to this category). Unfortunately, it's also for SOA Suite projects not an easy and one-off task. Most likely it will take some optimization iterations where you measure your GC behavior, tune the GC stetting and measure again. And even when you find the right GC settings for the moment you have to monitor GC behavior over the time because raising number of SCA Composites, more SCA Service calls or higher data volume will change the GC behavior. It’s also safe to keep basic GC logging in the production system.

Good thing about GC tuning is that there are plenty of good articles and blogs describing how to do meaningful GC tuning. I neither want to repeat all the available stuff nor I want to go through all 50+ parameters of the Sun Hotspot JVM. Instead I want to give some helpful GC hints on the JVM which are important during our own JVM tuning for SOA Suite, rarely mentioned or difficult to find. So make yourself familiar with the JVM basics if this didn’t happen so far.

First of all when you want to do JVM tuning you need a GC analyzing tool like HPjmeter to visualize GC behavior. Some tools could perform real-time monitoring but it’s sufficient to offline analyze the GC log files. Raw GC log file analyzing without a tool is a time-consuming task and needs a certain experience level.

Basic GC logging parameters: –verbose:gc –XX:+PrintGCDetails –XX:+PrintGCTimeStamps – Xloggc:<file>.

The –XX:+PrintGCDetails option provides the details needed for GC tuning. The –XX:+PrintGCTimeStamps setting causes that time stamps are printed at GC. The time stamp for a GC event is indicating the amount of elapsed time since the JVM launched (approximately matching with the Weblogic Server Activation Time which is shown on the Weblogic Console). When it is desirable to know the exact wall clock time over a time stamp representing the time since JVM launch, you could use the -XX:+PrintGCDateStamps setting which enables the printing of a localized date and time stamp indicating the current date and time.

The most obvious and important parameter is the right JVM Memory sizing which has to be aligned to the physical memory. Make sure the JVM always has 100% of the memory it needs, do not over-committing memory because the JVM Memory is an active space where objects are constantly being created and garbage collected (memory over-committing is the process of allocating more JVM Memory than there is physical memory on the system). Such an active memory space requires its memory to be available all the time. I was reading about a recommendation not to use more than 80% of the available RAM which is not taken by the operating system or other processes. Too small JVM Heap memory sizing will lead in worst case to an out-of-memory error already during startup or many Full GC Collections from the beginning. Too large JVM Heap memory sizing will pause your application for several tens of seconds. As a general rule of thumb, as the JVM Heap size grows, so does the overhead of GC for the JVM process. In order to give you a ballpark figure, if you have below 50 SCA Composites, I would recommend starting with 4 GB JVM memory (-Xmx4g) on your Weblogic Managed Servers. During your optimization, try to reduce the JVM Heap size if possible to get shorter GC times and to avoid wasting memory. If the JVM Heap always settles to 85% free, you might set the Heap size smaller.

Note: A common misunderstanding is to assume that the –Xmx value is equal to the Java Process memory needed, but the JVM Memory (or Java Process Memory) is greater than JVM Max Heap (greater than –Xmx) and this is due to the other additional JVM areas outside of the JVM Heap that make up the memory space of the total Java process such as JVM Perm Size (-XX:MaxPermSize), [number of concurrent threads]*(-Xss), and the “Other JVM Memory” section. The –Xss option is the Java thread stack size setting. Each thread has its own stack, its own program counter, and its own local variables. The default size is Operating System and JVM dependent, and it can range from 256k to 1024k. Each thread is given exactly the amount specified. There is no starting or maximum stack size. Since the default –Xss stetting is often too high, tuning it down can help save on memory used and given back to the Operating System. Tune down to a range that doesn’t cause StackOverflow errors. The “Other JVM Memory” is additional memory required for JIT Code Cache, NIO buffers, classloaders, Socket Buffers (receive/send), JNI and GC internal info.

Therefore the final JVM Memory calculation has to happen with this formula:

JVM Memory = JVM Max Heap (-Xmx) + JVM Perm Size (-XX:MaxPermSize) + [number of concurrent threads] * (-Xss) + “Other JVM Memory”

Typically “Other JVM Memory” is not significant however can be quite large if the application uses lots of NIO buffers, and socket buffers. Otherwise it’s safe assuming about 5% of the total JVM Process memory.

Check that you have an activation of "-server" JVM Hotspot JIT. It delivers best performance for SOA Suite running servers after a fair amount of time to warm up (keep in mind that the Domain configuration wizard configures “-client” when you create a Domain in development mode). The server mode has differences on the compiler and memory defaults tuned to maximize peak operating speed for long-running server applications (doing more code analysis and complex optimizations). The old rule for Server JVMs to put initial Heap size (-Xms) equal to maximum Heap size (-Xmx) is still valid. Otherwise you get Heap increase on the fly which always requires a Stop-The-World (STW) GC, even if the resizing is very tiny.

Equalization of memory values is also a good approach for the Permanent Space which is allocated outside of the Java Heap. Permanent Space is used for storing meta-data describing classes that are not part of the Java language. The -XX:PermSize setting specifies the initial size that will be allocated during startup of the JVM. If necessary, the JVM will allocate up to -XX:MaxPermSize setting. JVM efficiency is improved by setting PermSize equal to MaxPermSize. This Non-Heap memory area is pretty stable to my observations. Observe PermSpace usage and adjust accordingly by using tools like JConsole or VisualVM.

Keep always in mind that you should consider to scale-out an application to multiple JVMs (=Weblogic Managed Servers) even on the same host (so-called vertical clustering). Horizontal clustering is clustering across hardware boundaries for both load balance and failover as first objective. Even though for a 64 bit system, there is theoretically no upper memory limit but the constraint of available physical memory. But again, too large heap sizes certainly can cause GC STW problems. Smaller JVM heap sizes running on more JVMs is the solution implemented with vertical and horizontal clustering. There is no “gold rule”, optimal JVM heap size and number of JVMs (= Managed Servers) could only be found through performance testing simulating average and peak load scenarios (use tools like the supplied Oracle Enterprise Manager Fusion Control, professional tools like HP LoadRunner or open source tools like Apache JMeter and soapUI).

Most important besides the right JVM memory sizing is the choice of the right GC strategy or also called GC scheme. You have to decide between the optimization goals of maximum throughput and minimum pause time. You couldn’t have both. So if you have a domain doing online processing where users are waiting for quick response you would like to optimize for minimum pause time. On the other hand, if you have a domain doing batch processing and your SCA Services are not involved on an interactive application you could effort longer pause time, than you would like to optimize for maximum throughput (that’s also a reason why I recommend on my last blog a domain split in a scenario were you have to do online processing and batch processing).

The acceptable rate for Garbage Collection is for sure application-specific but the Oracle documentation mentions that a best practice is to tune the time spent doing GC to within 5% of execution time (which is already a high value in my opinion). Full Garbage Collection should in general not take longer than 3 to 5 seconds.

Let’s take a look on the GC scheme settings. Today most of the environments operate on (several) Multi Core/CPUs. I assume that your SOA Suite machine(s) have Multi Core/CPUs and therefore we could neglect the special settings for Single Core/CPU environments. Here are the two important Sun JVM Flags to choose the right GC strategy:

Maximum Throughput
(pause-times are not an issue)

Minimum Pause Time

(pause-times are minimized)

Sun JVM Flag

-XX:+UseParallelGC

-XX:+UseConcMarkSweepGC

Young Generation

(Eden Space + Survivor Spaces)

(Scavenges)

Parallel Young GC

(stop-the-world parallel mark-and-copy)

Parallel Young GC

(stop-the-world parallel mark-and-copy)

Old Generation

(Tenured Space)

Parallel Old GC

(stop-the-world parallel mark-and-compact)

CMS GC

(concurrent mark-and-sweep)

Parallel GC scheme is stopping the application execution and is using as many threads as possible (and therefore all available CPU resources) to clean up memory. GC STW happens. Concurrent GC scheme attempts to minimize the complete halting of the application execution as much as possible by performing the GC logic in parallel within threads that run concurrently with the application logic threads. Anyway, even Concurrent Mark-and-Sweep (CMS) will cause GC STW but less and shorter than a Parallel GC. Take a look at the GC log file, only the “CMS-initial-mark” and “CMS-remark” phase is causing GC STW. The marking and remarking pauses are directly proportional to the amount of objects in the Old Generation (Tenured Space). Longer pauses indicate a lot of tiny objects.

The JVM offers also two settings to control how many GC threads are used in the JVM. The –XX:ParallelGCThreads setting controls the number of threads used in the Parallel GC. The -XX:ConcGCThreads setting let you control the number of threads Concurrent Garbage Collectors will use.

On smaller multiprocessor machines with less than or exactly 8 CPU Cores you will configure the number of parallel threads equal to the CPU Cores.

–XX:ParallelGCThreads = [number of CPU Cores]

For example, if there are two Dual-Core Processors you will have a setting of 4 threads. If there are using 4 Dual-Core Processors or 2 Quad-Core processors you will have a setting of 8 threads.

On medium to large multiprocessor machines don't set the number of GC threads to be the same as the CPU Cores (there are diminishing returns). This is the formula to do the thread configuration for machines with more than 8 CPU Cores.

–XX:ParallelGCThreads = 8 + (([number of CPU Cores] – 8) * 5)/8

You get 1 parallel GC thread per CPU Core for up to 8 CPU Cores. With more CPU Cores the formula reduces the number of threads. For example for 16 CPU Cores you get: 8+((16-8)*5)/8 = 13 GC threads.

The number of threads for the CMS process is dependent on the number of the threads for the parallel GC.

-XX:ConcGCThreads = ([number of ParallelGCThreads] + 3) / 4

But be rather conservative and not too aggressive with the thread setting especially when you are doing vertical clustering. In a virtual environment the calculation is based on the number of CPU Cores assigned to the Guest OS.

Also be aware that CMS leads over the time to some Heap fragmentation which will cause the JVM to switch to a Mark-and-Compact collector. A mix of both small and large objects would fragment the Heap sooner. The JVM needs to find a block of contiguous space for the size of the object and this will slow down the JVM. There is a JVM parameter that could be used to detect fragmentation (-XX:PrintFLSStatistics=2) but it slows down the GC significant. Consider that most likely a SOA Suite Batch Domain has to handle larger objects than an Online Application Processing Domain.

The new Garbage-First (G1) Garbage Collector (testable since Java SE 6 update 14 Early Access Package and officially available since Java SE 7 Update 4) will be the long-term replacement for CMS and targets medium to large multiprocessor machines and large Heap sizes. Unlike CMS, G1 compacts to battle fragmentation and to achieve more-consistent long-term operation. But the first Weblogic Server version which supports JDK 7 is 12c.

When you do the JVM sizing you should know how large the JVM Heap sections are and when GC is triggered. Only with this knowledge you could do meaningful sizing and react on the number given by the GC log files.

These formulas are helpful to calculate the JVM Heap memory sizes:

Space	Calculation
Eden Space	NewSize – ((NewSize / (SurvivorRatio +2)) * 2)
Survivor Space (To)	(NewSize – Eden) / 2
Survivor Space (From)	(NewSize – Eden) / 2
Tenured Space	HeapSize – Young Generation

These formulas give you the real sizes of the generational JVM spaces. The Survivor Spaces serves as the destination of the next copying collection of any living objects in Eden Space and the other Survivor Space. Keep in mind that one Survivor Space is empty at any given time. Now it’s on you to monitor the GC behavior, do the right GC scheme setting, to calculate the optimized heap sizes and number of threads, doing the best non-heap size settings.

Let me show you an example on how successful GC optimization could improve your overall performance on SOA Suite Services running at a Weblogic Domain with two Managed Servers (SOA Suite 11g PS4, WLS 10.3.5, JDK 1.6.0_27 running on Solaris).

The story started with a default ParallelGC scheme setting (default GC values for JDK 6: -XX:+UseParallelGC, ParallelGCThreads=#CPU, SurvivorRatio=32, PermSize=64m, no GC logging) and a JVM heap size of 4 GB.

-server –Xms4g –Xmx4g –Xss512k -XX:PermSize=768m -XX:MaxPermSize=768m –verbose:gc –XX:+PrintGCDetails –XX:+PrintGCTimeStamps – Xloggc:/gc.out

The initial 4 GB JVM Heap size setting was discussed with Oracle and the Permanent Space setting is coming from first observation. The 64bit JDK6 stack size on Solaris is 1024k. We reduced the thread stack size on 512k.

After running some HP LoadRunner Average Load Tests, the HP JMeter GC STW diagram was showing GC STW activities between 23 and 35 seconds.

This was unacceptable for an Online Processing Domain where a user is waiting for response. SCA Services shouldn’t be blocked by 35 seconds freezing the execution. In order to optimize for minimal pause time the GC scheme changed on CMS (–XX:+UseConcMarkSweepGC).

-server –Xms4g –Xmx4g –Xmn2g –Xss512k –XX:PermSize=768m –XX:MaxPermSize=768m –XX:+UseConcMarkSweepGC –XX+UseParNewGC –XX:+CMSParallelRemarkEnabled – XX:+UseCMSInitiatingOccupancyOnly –XX:CMSInitiatingOccupancyFraction=55 –XX:+CMS ClassUnloadingEnabled –XX:ParallelGCThreads=8 –XX:SurvivorRatio=4 –verbose:gc –XX:+PrintGCDetails –XX:+PrintGCTimeStamps – Xloggc:/gc.out

The -Xmn2g setting configures the Young Generation Heap size. A parallel Young Generation collector (–XX+UseParNewGC) is best used with the CMS low pause collector that collects the Tenured Space. The –XX:+CMSParallelRemarkEnabled setting enables multiple parallel threads to participate in the remark phase of CMS. Since this is a STW phase, performance is improved if the collector uses multiple CPU Cores while collecting the Tenured Space. The –XX:CMSInitiatingOccupancyFraction setting on 55 means that CMS GC starts at 55% memory allocation (default is 68%). The – XX:+UseCMSInitiatingOccupancyOnly setting forces CMS to accept the –XX:CMSInitiatingOccupancyFraction setting and not starting the CMS collection before the threshold is reached (disables internal JVM heuristics, without this setting the JVM may not obey CMS initiating occupancy fraction setting). The -XX:+CMSClassUnloadingEnabled setting activates the class unloading option. It helps to decrease the probability of “Java.lang.OutOfMemoryError: PermGen space” errors.

Here is the calculation on the JVM Heap memory size with the given parameter.

Space	Calculation	MB	KB
Eden Space	2048m – ((2048m / (4 +2)) * 2)	1365.3	1398101.33
Survivor Space (To)	(2048m – 1365.3m) / 2	341.35	349525.335
Survivor Space (From)	(2048m – 1365.33m) / 2	341.35	349525.335
Tenured Space	4096m – 2048m	2048	2097152

So the Tenured Space (Old Generation) cleanup starts at a filling size of 1153433.6 KB (1126.4 MB).

Further average and performance load tests reported that the GC STW activities went down on a maximum of around 10 seconds for most CMS GC STW, but 10 seconds are still too long and STW activities happened much too often.

We analyzed the GC logs which reported Concurrent Mode Failures like the following GC entry.

238276.333: [GC 238276.334: [ParNew: 1709331K->1709331K(1747648K), 0.0001286 secs]238276.334: [CMS238276.640: [CMS-concurrent-mark: 13.637/14.048 secs]

(concurrent mode failure): 1663134K->1619082K(2097152K), 53.1504117 secs] 3372466K->1619082K(3844800K)

Concurrent Mode Failure means that a requested ParNew collection didn’t run because GC perceives that the CMS collector will fail to free up enough memory space in time from Tenured Space. Therefore worst case surviving Young Generation objects couldn’t be promoted to the Tenured Space. Due it this fact, the concurrent mode of CMS is interrupted and a time-consuming Mark-Sweep-Compact Full GC STW is invoked.

On GC log file entry mentioned above the ParNew request happens at JVM Clock Time 238276.333 whereas the Young Generation had a fill level of 1709331 KB (1669.27 MB) out of 1747648 KB (1706,69 MB). This means a filling level of 97.8% out of 1747648 KB (rounded: 1706 MB = 1365 MB Eden Space + 341 MB One Survivor Space). The GC STW happens at Clock Time 238276.334 and took 53.15 seconds. The Tenured Space occupancy dropped from 1663134 KB (1624 MB) to 1619082 KB (1581 MB). This means that 97.5% of all objects survived the Tenured Space clean-up. Only 43 MB of the Tenured Space is getting freed.

So the Young Generation had around 1669 MB before GC which could only free 43 MB on the Old Generation. The Tenured Space seems not big enough to keep all the promoted objects. There is an Oracle recommendation when changing from a Parallel GC scheme to a Concurrent GC scheme. Oracle recommends increasing the Tenured Space by at least 20% to 30% in order to accommodate fragmentation.

We decided to keep the overall JVM Heap size stable and instead to decrease the Young Generation in order to give the Old Generation more space (-Xmn2g to – Xmn1g).

-server –Xms4g –Xmx4g –Xmn1g –Xss512k –XX:PermSize=768m –XX:MaxPermSize=768m –XX:+UseConcMarkSweepGC –XX+UseParNewGC –XX:+CMSParallelRemarkEnabled – XX:+UseCMSInitiatingOccupancyOnly –XX:CMSInitiatingOccupancyFraction=55 –XX:+CMS ClassUnloadingEnabled –XX:ParallelGCThreads=8 –XX:SurvivorRatio=4 –verbose:gc –XX:+PrintGCDetails –XX:+PrintGCTimeStamps – Xloggc:/gc.out

Here is the new calculation on the JVM Heap memory size after changing Young Generation on 1 GB.

Space	Calculation	MB	KB
Eden Space	1024m – ((1024m / (4 +2)) * 2)	682.66	699050.66
Survivor Space (To)	(1024m – 682.66m) / 2	170,67	174766.08
Survivor Space (From)	(1024m – 682.66m) / 2	170,67	174766.08
Tenured Space	4096m – 1024m	3072	3145728

Afterwards we triggered a new Average Load Test. The GC STW activities went down to a maximum of 4.3 seconds, but much more important is the fact that GC STW significantly less frequently happens.

This was good for the Average Load Tests, but a Peak Performance Test was showing an accumulation of CMS GC STW activities during the Performance Test (around Clock Time 200000).

We decided to do a resizing and slightly increased Young Generation (from –Xmn1g to –Xmn1280m) in order to allow objects to be held longer in Young Generation with the hope that they will be collected there and are not promoted to Old Generation. As mentioned by Oracle Doc ID 748473.1, most of the BPEL engine’s objects are short lived. Therefore the Young Generation shouldn’t be too small.

The Survivor Spaces allow the JVM to copy live objects back and forth between the two spaces for up to 15 times to give them a chance to die young. The –XX:MaxTenuringThreshold setting governs how many times the objects are copied between the Survivor Spaces (the default value is 15 for the parallel collector and is 4 for CMS). Afterwards the objects are old enough to be tenured (copied to the Tenured Space). So we increased also the Survivor Spaces (from –XX:SurvivorRatio=4 to –XX:SurvivorRatio=3, see calculation below). Additionally we increase the –XX:CMSInitiatingOccupancyFraction setting on 80% in order to make use of the large Old Generation capacity.

-server –Xms4g –xmx4g –Xmn1280m –Xss512k –XX:PermSize=768m –XX:MaxPermSize=768m –XX:+UseConcMarkSweepGC –XX+UseParNewGC –XX:+CMSParallelRemarkEnabled – XX:+UseCMSInitiatingOccupancyOnly –XX:CMSInitiatingOccupancyFraction=80 –XX:+CMS ClassUnloadingEnabled –XX:ParallelGCThreads=8 –XX:SurvivorRatio=3 –verbose:gc –XX:+PrintGCDetails –XX:+PrintGCTimeStamps – Xloggc:/gc.out

This means the following new JVM Heap sizes:

Space	Calculation	MB	KB
Eden Space	1280m – ((1280m / (3 +2)) * 2)	768	786432
Survivor Space (To)	(1280m – 768m) / 2	256	262144
Survivor Space (From)	(1280m – 768m) / 2	256	262144
Tenured Space	4096m – 1280m	2816	2883584

Now the Old Generation cleanup starts at a filling size of 2252.8 MB. Take a look at the following GC STW diagram which confirms the GC tuning effort. The diagram reports a well-tuned JVM with relatively large number of short Parallel Scavenges with less frequent, but more expensive, full CMS GC.

More GC fine tuning and better GC behavior is for sure possible. But we are quite satisfied with the reached performance results. The SCA Services have much better response times on Average and Peak Performance Tests.

Finally you have to keep a closer eye on the Code Cache of the JVMs running your Weblogic Servers with SOA Suite. It has nothing to do with GC, but here is the explanation why it’s important. As you all know, Java code is getting compiled to bytecode for the JVM. Bytecode has to be converted to native instructions and library calls for the target platform. The interpreted mode always converts bytecode “as it used” which slows down the execution performance. Whereas the Just-In-Time (JIT) compilation keeps preparative compiled code segments (performance-critical “Hotspots”) in a Code Cache, the cached native compiled code is getting reused later without needing to be recompiled (for example in loops). That’s the reason why the JVM over time obtain near-native execution speed after the code has been run a few times.

So it’s performance critical when you have warnings like …

Java HotSpot(TM) Client VM warning: CodeCache is full. Compiler has been disabled.

Java HotSpot(TM) Client VM warning: Try increasing the code cache size using -XXReservedCodeCacheSize=

You can imagine how bad it would be if JIT’ing is switched off because the Code Cache is getting full (default Code Cache size with –server option and 64bit is 48 MB). The Non-Heap Code Cache is allocated during JVM startup and, once allocated, it cannot grow or shrink. Fortunately the Code Cache size could be changed with the -XX:ReservedCodeCacheSize setting. But increasing the Code Cache size will only delay its inevitable overflow. Therefore is more important to avoid the performance breaking interpreted-only mode. The -XX:+UseCodeCacheFlushing setting enables the compiler thread to cycle (optimize, throw away, optimize, throw away), that’s much better than disabled compilation. So if you see that Code Cache warning I would recommend to slightly increase the Code Cache size and to enable Code Cache Flushing. The –XX:+PrintCompilation setting give you more details (or watch the Code Cache behavior on JConsole).

I just want to leave you with three more tips for your own JVM GC tuning. There is a JVM setting -XX:+PrintFlagsFinal which will show you all JVM settings during the startup phase. Second tip is a suppression of programmatic caused Full GC STW by using the System.gc() and Runtime.getRuntime().gc() methods. The -XX:+DisableExplicitGC setting will ignore these method calls which are undermines the GC tuning efforts. Take a look for “Full GC (System)” entries on your GC log files. Third tip is to take a look at the Standard Performance Evaluation Corporation and how they configure the JVM for Weblogic in order to get best performance (move on the result page, look for a Weblogic test on our platform and take a look on the JVM settings at the Notes/Tuning Information section).

Tuesday, August 28, 2012

SOA Suite Performance – Don’t fight against the machine - Part 1

Performance optimization on SOA Suite applications is a challenge! After about one and a half years of performance tuning activities on a large SOA Suite project I’m fully confident to make such an assertion. Why is it so difficult to tune the performance, because the performance is depended on several system architecture layers.

On top of the stack JDeveloper developed applications/services running on the SCA Container. Design and configuration on SCA Composite and SCA Component level have a critical influence on the success of performance optimization activities. Below the SCA Application layer we have the FMW Common Infrastructure (e.g., JRF, OWSM, EM Fusion Control) together with the SOA Suite Service Engines (e.g. BPEL, Mediator) as SCA Runtime environment SOA Infra. There are many performance relevant tweaks you could do on the SOA Infra layer. The Oracle Weblogic Application Server is the foundation on which all Java EE-based Oracle FMW components are build. Classic techniques on JEE performance optimization are used together with the possible utilization of the Coherence Data Grid. JVM optimization is also well known but its performance essential to have the right GC strategy with the right parameterization. Below the JVM we will have the layer with the Operating System which could run on a Hypervisor. Both are highly critical on resource allocation and therefore important on performance tuning activities. As an Application Architect you have to trust the Infrastructure Experts doing their best on performance tuning because in most cases you are leaving your knowledge domain.

Down below you have the machine hardware with network and storage. Here you will find your “physical” limitations. The machine is coming with a certain numbers of CPU/cores, clock speed and the size of the internal memory cache of the system CPU(s). You will have a certain speed and width of the system’s address and data buses. And you will be limited by the amount of memory, the width of its data path and the memory access time which causes the CPU(s) to wait for memory. And you will be highly depended on the speed of the I/O devices such as network and disk controllers.

These hardware factors have the greatest influence on the application performance. For example, Oracle SOA Suite architectures will have CPU-intensive operations happening in a single thread for components like rule evaluation and XML processing. Therefore the CPU clock speed is an important factor to consider. There will come a time during your performance optimization activities when, despite all tuning activities on the upper architecture system layers, you have to add more powerful hardware or upgrade the existing hardware. If you miss the right time to scale up and to scale out, the application architecture will become out of balance on the non-functional system qualities by focusing too much on the performance aspect on the SCA application layer. And because you have the design freedom ugly compromise are make for reaching the right performance. Clearness and comprehensibility of the architecture, extensibility, reusability and maintainability are system qualities which will suffer as a result. System architects are also talking of increasing system entropy (disorder). Furthermore the relationship on costs of the performance tuning activities and the purchase of hardware is at a certain point of time no longer reasonable.

When you start to reduce your SCA Composite modularization on performance reasons, when you realize that your BPEL processes are getting monolithic and you are doing BPEL programming in small, when you don’t mediate on inbound and outbound interfaces because of performance reasons you will have on indicator that it’s time to scale up and to scale out.

Note: Don’t fight for last possible amount of performance on the SCA application layer. You should realize when the right time has come to throw in the towel. Don’t be naïve and believe that all performance problems could be solved on the layers above the machine.

If you have online and batch processing you should also consider to have an own Weblogic Cluster running SOA Suite for online processing and an own Weblogic cluster for batch processing. This is an important consideration because the configuration on the layers will differ between an online and batch related configuration. That’s something we will examine on the next blog when we take a detailed look on the JVM.

Tuesday, September 6, 2011

Oracle SOA Suite: BPELmania (BPEL vs. Spring Beans vs. Mediator)

The current Oracle SOA Suite 11g PS4 (11.1.1.5) offers several SCA Component types for service implementation (BPEL, Human Task, Spring Context, Business Rule and Mediator). All these different SCA Component types raises the question what component to use in which context. Quite clear is the usage scenario of the Human Task and Business Rule Components. But the BPEL, Spring Context and Mediator Components have overlapping functionality. Too often an implementation based on BPEL Components is chosen instead of weighing against the other two available SCA implementation types. This fact is probably explicable form the history of the Oracle SOA Suite where version 10g only offered BPEL. Whereas Spring Context and Mediator Components are introduced starting with 11g and the SCA concept.

Let’s do a short introduction of the three SCA component types and weigh the pros and cons.

BPEL
BPEL (Business Process Execution Language) is an OASIS standardized orchestration and executable language for specifying actions within business processes. BPEL constructs and execute workflows by assembling discrete services into a complete process flow. Beside the standardized basic and structured activities to define BPEL processes, in addition Oracle also defines several extension (proprietary) activity types.

BPEL describes processes on a true business process level (future wise covered by executable BPMN coming with the Oracle BPM Suite on top of the Oracle SOA Suite). And the graphical representation makes processes tangible for business users.

In general I agree with the Wikipedia description that BPEL aims to enable “programming in the large”. But there is the potential danger that inexperienced SOA designer/developer use BPEL components for all kind of service implementation on the Oracle platform (especially the “Java Embedding” activity opens the door on ill-considered implementations). BPEL's control structures such as “if-then-else”, “while” and “repeat until” as well as its variable manipulation facilities together with the Oracle enhancements tends Oracle BPEL to become a “programming in the small” language.

Spring Beans
The Spring Framework is a popular application framework on the Java platform. It aims to reduce the complexity of the programming environment based on the four key concepts of focusing on POJO (Plain Old Java Objects), Dependency Injection (DI), AOP (Aspect-Oriented Programming) and the simplified API Abstraction.

Normal Java Objects (POJO) is at the center of the Spring Framework. POJOs are cross-linked by Dependency Injection (bean dependencies on other objects are not hard-coded, instead object references described as bean properties on a XML context configuration file are injected by setter methods). POJOs will be supplemented in a non-intrusive way with cross-cutting aspects by using AOP (aspects like logging, monitoring, security, transactions). And finally POJOs communicate with other APIs via the simplified API abstraction (like JDBC, JMS, EJB, Quartz, etc.).

The SCA Spring Component Implementation Specification specifies how the Spring Framework can be used as a component implementation technology in SCA. Since Oracle SOA Suite 11g Patch Set 2 it is possible to have Spring Beans as part of the Spring Framework on the Oracle SCA Container as first-class citizen (use Spring Beans like any other SCA Component). Spring Beans makes it possible to run Java implemented services on the Oracle SCA Container.

Compared with BPEL Components the intensive usage of Spring Beans entails risk the other way round. People with a strong JEE or Spring Framework development past will tend to use Spring Beans for all kind of implementations even for the “programming in large” concerns. Especially the possibility to use the matured and out-of-the-box available JCA Technology Adapter is often pushed into the background. Developers really don't have to keeping re-inventing the wheel.

Mediator
As the name suggests the Mediator Component is about intra-composite mediation between various SOA Suite Components. The Mediator brings Enterprise Service Bus (ESB) aspects inside of the SCA Composite. The main task is to route data and sequencing messages between consumers and producers. Defined routing rules could filter, transform and validate data. Additionally the Mediator Component could act as an “echo” service component. Which means that the Mediator is echoing directly back to the initial caller after transformation, validation or enrichment activities. Together with so-called Mediator Java Callout Handler which enables “pure” Java based message manipulation the Mediator starts to compete with BPEL and Spring Beans.

BPELmania – don’t forget the “evil” side of BPEL
It is somewhat similar with the famous Hammer-Nail quote from Abraham Maslow (“If you only have a hammer, you tend to see every problem as a nail.”). For some people BPEL is the hammer with the purpose to implement every service exclusively with BPEL (using BPEL as “glue”, not as programming language). The right approach is to weigh pros and cons of BPEL, Spring Beans and Mediator in a certain usage scenario.
The BPEL main major advantages are the graphical business process visibility, the flexibility for quick process changes and the support for long running business processes.

Business processes can run for several minutes, hours or days (or even longer). With pure Java it’s cumbersome to handle long running processes. It makes the application complex and complicated for the developers to maintain the process.

BPEL also supports compensation in a relatively easy way. Compensation, or undoing non-transactional steps in the business process that have already completed successfully, is one of the most important concepts in business processes. The goal of compensation is to reverse the effects of previous activities that have been carried out as part of a business process that is being abandoned.

Business processes may also have to react on certain events. Such events can be message events or alarm events. Message events are triggered by incoming messages through operation invocation on port types (onMessage Event Processing). Alarm events are time related and are triggered either after a certain duration or at a specific time (onAlarm Event Processing). Oracle BPEL provides good support for managing events in business processes.

Another advantage of BPEL is the portability. BPEL can be executed on other process engines running on different vendor platforms. The prerequisite however is that the BPEL process is only using standardized basic and structured activities.

List of BPEL advantages (on the Oracle SOA platform)

Graphical business process visibility
Flexibility for quick process changes
Support for long running business processes
Business compensation handler (“UNDO”)
Oracle’s good integration of Human Workflow (Human Task Components)
Oracle’s good integration of the Business Rules Engine (Business Rule Components)
Support of events and EDN (Event Delivery Network)
Portability to other standard-based process engines (if obeying the BPEL standards)
Support of synchronous, asynchronous and one-way (asynchronous) process interaction with the caller

It’s obvious that BPEL has been designed to address the requirements of defining business processes. But a BPEL process will cause definitely more load on the application server and the database, because of the BPEL process engine overhead causing a lot of transactions on the SOA/MDS database. It’s possible to minimize the additional load but it’s a complex endeavor. Developers have to deal with dehydration (storing the current status of the BPEL process into the database is known as dehydration), delivery service configuration for incoming messages, audit tracking configuration, the configuration of the completion persist policy, etc.

Additionally Oracle BPEL could handle technical transaction but proper transactional behavior demands deep understanding of the platform behavior (transaction property configuration and dehydration causing activities which causes transaction commits).

Dealing with BPEL variables will require knowledge of XPath and XSLT for transformations. XSLT transformations will cause slowing-down because of Oracle XSLT processor calls. But one of the major weaknesses is the memory consumption caused by XML BPEL variables kept in memory as DOM class (Document Object Model). A DOM class is an in-memory representation of an entire document and the complete infoset state for an XML document. In memory DOM-tree provide good flexibility for developers, but the cost of this flexibility is a potentially large memory footprint and significant processor requirements, as the entire representation of the document must be held in memory as objects for the duration of the document processing (acceptable for small documents , but memory and processor requirements can grow rapidly with document size).

List of BPEL disadvantages (on the Oracle SOA platform)

JDeveloper provides no BPEL debugging features
JDeveloper provides no BPEL refactoring support
BPEL process engine caused additional load (app server, DB)
Complex and non-transparent configuration for performance tuning
Complex and non-transparent configuration of technical transactions
Deep XPath and XSLT knowledge is needed (assign and transformation activities)
XSLT transformation performance overhead (overhead on Oracle XSLT processor calls)
Inapplicable for large XML document processing (because of memory and CPU consumption)
Risk of using BPEL for doing “programming in small”
Oracle extension (proprietary) activity types make processes non-portable
Compared with Spring Beans or Mediator Components, a single BPEL Components offer “only” one service operation (it’s in the nature of BPEL)

Alternatively, in case of a straight forward sequenced process without any control flow, the usage of Spring Beans or Mediator should be considered to achieve higher performance and throughput.

Spring Beans for “programming in the small”
Spring Beans complement the Oracle SCA Components rather than competing on BPEL Components. In fact they are in a strong competition on SLSB EJBs (Stateless Session Bean). Spring Beans and EJBs implement self-contained business services invoked by BPEL. But Spring Beans operate inside of SCA Composites running on the SCA Container whereas EJBs are outside of the SCA Container running on the EJB Container.

The Spring Framework leads to flexibility and modularization of the design because smaller (Spring Beans) components can be easily wired to more complex components (Spring Bean to Spring Bean wiring), following the principles of the SCA composite framework. Spring focuses on “wiring in the small” in traditional applications. On the Oracle SCA Container Spring Beans could be “wired in the small” and “wired in the large”.

EJBs could be used through the SCA EJB Adapter (placed on the external references swim lane), but direct EJB to EJB wiring couldn’t happen on a SCA composite which blocks the architectural modularization concept. Additionally, EJBs could not leverage the Oracle SCA Adapter Framework (for example make use of the DB Adapter or File Adapter) which leads to additional development efforts.

It is even possible to give Spring Bean implemented SCA Services an EJB interface (binding.ejb) in case that services have to interoperate with an EJB application landscape.

One principal reason for the usage of Spring Beans instead of EJBs is SCA’s way to wrap Spring Beans with other SCA Components into one SCA Composite in order to have only one single deployment unit and artifact of versioning (the SCA Service Archive, SAR). This SAR can be versioned as one, deployed as one, tested as one and managed as one instead of worrying about individual pieces. Finally already before the SCA era started Spring Beans become very popular in the Java Community as an alternative and replacement of EJBs (perceived as too heavy-weighted and complex with a lot of unnecessary boilerplate overhead).

An important advantage is the usage of Spring Beans in complex scenarios because Spring Beans could be easily debugged (not possible with BPEL or Mediator Components) and service mocking could happen in a declarative manner (on the context configuration file).

Despite the completeness and maturity of the Oracle SOA platform, projects will run into a situation in which the Oracle SOA Suite cannot provide a solution. Spring Beans are working like a “Swiss army knife”. In other words, Spring Beans can provide the lever to code for patches in the Oracle SOA components. Spring Bean could help on certain scenarios like high performance requirements (on <PS5 make sure you have patch 12402061 or one of the following patches running) or non-working DB Adapter queries (like the range operator BETWEEN, multiple value operator IN on where conditions, dealing with XMLType columns or just very complex SQL commands).

Spring Bean Advantages (on the Oracle SOA platform)

Spring is SCA standardized
Easy debugging and testing (isolated testing and interface mocking)
Spring development is simple-n-short, on-point and easy to read (because of DI, AOP and especially because of the Simplified APIs)
High performance scenarios (“lean and mean”, no overhead like for example dehydration)
Enables easy Java logic reuse on custom-build and standard third-party libraries (packaged and deployed with the SCA Composite)
Flexibility and modularization by “wiring in small” and “wiring in large”
“Swiss army knife” of all missing platform features and necessary workarounds
Concept of “Pre-defined” and “global” Spring Beans (>=PS3) which provides out-of-the-box functionality and global spring context definitions
Compared with BPEL and Mediator, the Spring Bean Component is easy portable
Well integrated on the Oracle SCA environment (leverage SOA Suite build-in features)
Support of synchronous, asynchronous and one-way (asynchronous) Spring Bean interaction with the caller

But because of the all-purpose aspects of Spring Beans there is a risk of non-consideration of BPEL components, Mediator components or already existing JCA Adapter. Using Spring Beans for “programming in large” and especially for long-running process coordination will be the wrong approach. Doing all kinds of “backend access layer” programming and disregard the existence of out-of-the-box JCA Adapter is a misunderstanding of the Oracle SOA platform offering. Developers need to resist the temptation of doing everything with Spring Bean components.

Spring Bean disadvantages (on the Oracle SOA platform)

No Audit Tracking information
Difficult to implement AOP aspects and Springs dynamic language support
Risk of using Spring Beans for doing “programming in large”
Risk of not using JCA Adapter for connectivity

Mediator as “jack-of-all-trades”
The Mediator is a “fallen” ESB (former ESB on the 10g platform), now working as an intra-composite component. This light-weight Service Bus mainly competes with the OSB, the Oracle Service Bus (former AquaLogic Service Bus on the BEA platform). The Mediator could be used as a “composite proxy” behind the exposed service interfaces. In reverse, the Mediator could be used as an “external reference proxy” for any external service invoked by a composite. But virtualization of services is also a main task of the OSB. Antony Reynolds and Matt Wright describe three basic approaches on the usage of Mediator and OSB in inter-composite application communication.

Centralized topology (go always via OSB)
Peer-to-Peer topology (go directly via the service infrastructure- called Oracle Fabric)
Hybrid topology (combine both upper approaches)

It’s almost impossible to recommend only one approach. It always depends on the requirements in a given project. All approaches have pros and cons but the Peer-to-Peer topology has same key advantages.

All invocations are optimized, resulting in significant improved performance
Transactions are propagated in composite-to-composite calling scenarios
Complete audit trails across the whole composite calling chain

Finally the OSB is adding an additional layer of complexity in combination with additional communication overhead. OSB could be used for example handling service security aspects, service throttling and service result set caching (internal implemented by Coherence). But it certainly makes sense to use the OSB in an enterprise application (for example ERP systems) calling scenario. These are applications in their own right. Little sense makes the usage of the OSB in a composite-to-composite calling scenario where one composite is calling a “private” composite for service implementation (running on the same SCA Container). Private composite means an internal used composite implementing an internal re-used entity or utility service.

Mediator is powerful and could be used on several scenarios were designer/developer primarily think of using BPEL or Spring Beans. First of all the Mediator could replace efficiently BPEL Components in a simple straight forward orchestration use case. If simple service invocations have to happen in sequential or parallel the Mediator might be an alternative. Especially in “low” level invocations on the JCA Adapters like the Database Adapter (for example to implement an Entity Service).

The Mediator could execute the calls in sequence or in parallel (for example Insert DML commands on the DB Adapter). Whilst implementing the mediation enrichment pattern, the reply from one target service can be forwarded to another target service without going back to the original caller (Lucas Jellema calls it “small-time enrichment”).

Beside the “normal” XML payload validation, the Mediator could also do more complex Schematron validations (for doing cross XML document validation). The Mediator implements the feature of echo services. This means the Mediator is directly going back to the requester without the need to route to another target service. The echo feature of the Mediator provides the opportunity to implement utility services for complex transformation or validation. Even more interesting is the Java Callout Handler, which can act as a kind of Java backdoor capability. Java callouts enable you to use external Java classes to manipulate messages flowing through the Oracle Mediator. At this point the Mediator strongly overlaps with Spring Bean functionality. For example the Mediator Component could implement a complex Java based transformation or validation as echo service (complex validation implementation overlaps on the other side with the Business Rules Component).

Finally the Mediator is absolutely necessary in a SCA Composite design where several exposed service operations are implemented by BPEL Components. A BPEL Component could only offer one service interface.

Mediator advantages (on the Oracle SOA platform)

Resequence messages capability
Support of EDN (Event Delivery Network)
Mediator Asynchronous Messaging (synchronous to asynchronous)
Mediator supports Domain Value Maps
(DVM helps significantly when utilizing Canonical Data Models)
Mediator provides the VETRO capabilities intra-composite
(VETRO stands for Validate, Enrich, Transform, Route, Operate)
Echo Service feature (independent service implementation)
Java Callout Handler (door-opener to the powerful Java world)

The Oracle Mediator is an Oracle proprietary SCA Component Implementation Type. Portability of BPEL (without using Oracle extension activities) and Spring Beans to a new SOA platform could be reached easily. There is no SCA standard on Mediator Components therefore it would require large rework on a platform move.

Sometimes it’s necessary to handle SOAP Faults on Mediator Java Callouts. Without patching the system it’s not possible to handle proper SOAP Faults (<=PS4). Designer and Developer should be also aware that the parallel execution type will cause a new transaction. This fact is (hidden) on the documentation but could cause long trouble-shooting sessions.

Mediator disadvantages (on the Oracle SOA platform)

Mediator is a highly Oracle proprietary component
Mediator SOAP Fault Handling on Java Callouts (only with patch possible)
Mediator parallel execution type initiate (in-transparent) a new transaction
Deep XPath and XSLT knowledge is needed (assign and transformation)
XSLT transformation performance overhead (overhead on Oracle XSLT processor calls)

Summary
Use the Oracle SCA Components as originally conceived. BPEL is not the answer on all design/implementation scenarios. Ugly “back door” Java implementation through BPEL Java Embedded Activities should be buried in the past. Future Java based service development on the SCA Container is not a matter of SLSB (Stateless Session Bean) EJB development. These concerns are clearly pointing on the usage of Spring Beans. The Mediator component is a light-weight intra-composite ESB, but in selected use cases also a valid alternative on BPEL or Spring Bean development.

On an architecture layer diagram the BPEL Component is placed on the Business Process Layer. A BPEL Component could call multiple Sub-BPEL processes on the Business Process Layer. BPEL processes shouldn’t implement business services, they should orchestrate them. New Business Services on the SCA Container should be implemented by Spring Beans or Mediator Components as Echo Service (for sure, Human Task and Business Rules Components are also excellent candidates to implement business services). Reuse of existing Business Services implemented by SLSB EJB is possible via EJB Adapter or direct calls from Spring Beans. The Mediator Component is used for intra-composite and composite-to-composite (using the peer-to-peer topology) routing scenarios.

The diagram doesn’t contain Human Task and Business Rules Components. Also the OSB is left out in order to reduce complexity. Layer bridging by skipping the Business Process Layer is a real-world scenario.

The Oracle SOA Suite offers a rich construction kit on SCA Components and several ways to use these components. It’s the task of the architects, designer and developers to use these components in the right way.