Friday, December 11, 2009

Deciding to use DomainHealth

DomainHealth (DH) is an open source "zero-config" monitoring tool for WebLogic. It collects important server metrics over time, archives these into CSV files and provides a simple web interface for viewing graphs of current and historical statistics.

So when should you consider using DomainHealth?


When you don't have a full monitoring solution in place already for your WebLogic environment. Ideally an organisation will have standardised on using Oracle Enterprise Manager or a 3rd party management tool, for its comprehensive monitoring needs. Oracle Enterprise Manager also caters for various management requirements in addition to monitoring, which DH does not. The scope of DH's monitoring capabilities is purely focussed on tracking current and historic usage of certain key server resources. DH is not applicable for profiling your deployed applications (instead see Oracle AD4J) or intensively monitoring your JVM (instead see Oracle JRockit Mission Control). However, some organisations using Enterprise Manager may find that DH acts as a convenient and complementary addition to their monitoring solution.

The most comparable tool to DomainHealth is the WebLogic Diagnostic Framework Console Extension, which ships as standard with WebLogic. I use both of these tools in my day-to-day work, where different situations and requirements dictate the use of one over the other.

I use DomainHealth (with its built-in WLDF harvesting mechanism) rather than the WLDF Console Extension, in situations where some of the following factors are advantageous:
  1. Zero-configuration. An administrator does not have to first work out and configure the server objects to monitor. An administrator does not have to spend time creating a complex WLDF module with the correct object types, names and attributes in it. An administrator does not have to work out what graphs to plot and then configure specific graphs for each server in the domain for every new environment (e.g. Test, Pre-Prod, Prod1, Prod2).
  2. Minimal performance impact on Managed Servers. Obtains a set of statistics once and ONLY once, regardless of how many times you come back to view the same statistics in the graphs. The background statistics collection work is driven from the admin server, once per minute, lasting far less than a second.
  3. Tools friendly storage of statistics in CSV files. Administrators can open the CSVs in MS Excel or Open Office for off-line analysis and graphing. Using CSV files rather than WebLogic Persistent File Stores on the admin server has no detrimental performance impact. It doesn't matter if it takes 10 microseconds or 100 milliseconds to persist the set of statistics - timeliness only has to be to the nearest minute. The file I/O for writing data to CSV files on the admin server is not in the 'flight-path' of transactions that happen to be racing through the managed servers.
  4. Minimal administrator workstation pre-requisites. Doesn't require Java Applet support on the administrator's workstation; it's browser-friendly and just uses very simple HTML and PNG images to display graphs.
  5. Hot deployable. Deployable to an already running domain for diagnosis of currently occurring problems, without needing to restart the admin server.
  6. Statistics don't constantly scroll whilst trying to analyse them. Administrators can focus in on the current window of time or an historic window of time, in a graph, without it continuously refreshing and moving to a later time. A simple set of navigation buttons is provided to move backwards or forwards in time or just go to the most current time.
  7. Statistics can be viewed for non-running Managed Servers. If a managed server has just died, graphs of its recent statistics can still be viewed to help diagnose the cause of the failure, without first requiring the managed server to be recovered and restarted.
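
As a rough illustration of the off-line analysis mentioned in point 3, here is a minimal Java sketch that averages one column of a CSV file. The column names and sample rows are hypothetical and are not DomainHealth's actual file layout; check the header row of a real file and substitute the column you care about.

```java
import java.util.Arrays;
import java.util.List;

class CsvColumnAverage {
    /** Returns the mean of the named column, skipping blank or non-numeric cells. */
    static double average(List<String> csvLines, String columnName) {
        if (csvLines.isEmpty()) {
            throw new IllegalArgumentException("Empty CSV input");
        }
        // The first line is assumed to be a header row naming each column
        int index = Arrays.asList(csvLines.get(0).split(",")).indexOf(columnName);
        if (index < 0) {
            throw new IllegalArgumentException("No such column: " + columnName);
        }
        double sum = 0;
        int count = 0;
        for (String line : csvLines.subList(1, csvLines.size())) {
            String[] cells = line.split(",");
            if (index >= cells.length) {
                continue; // short row, nothing recorded for this column
            }
            try {
                sum += Double.parseDouble(cells[index].trim());
                count++;
            } catch (NumberFormatException e) {
                // blank or non-numeric cell, ignore it
            }
        }
        if (count == 0) {
            throw new IllegalArgumentException("No numeric values in column: " + columnName);
        }
        return sum / count;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; real column names will differ
        List<String> sample = Arrays.asList(
            "DateTime,HeapUsedMB,OpenSockets",
            "2009-12-11 10:00,210,14",
            "2009-12-11 10:01,230,15",
            "2009-12-11 10:02,250,13");
        System.out.println("Average HeapUsedMB: " + average(sample, "HeapUsedMB"));
    }
}
```

For a real file, the lines could be loaded with java.nio.file.Files.readAllLines() before calling average().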
I use the WLDF Console Extension rather than DomainHealth when some of the following factors are advantageous:
  1. Infinitely configurable. Administrators get to choose exactly what server resources they want to monitor.
  2. Fine-grained statistics capture. Statistics are gathered and displayed at a much higher frequency than just once every minute.
  3. Shipped as part of WebLogic. No need for an administrator to seek corporate approval to download and provision a 3rd party open source application into the organisation's WebLogic environment.
  4. Statistics can be retrieved for the periods of time when the Admin Server was down. As long as an administrator has previously configured a WLDF module with the right harvested objects and attributes, statistics can still be retrieved retrospectively, by the Console's graphs, following a period of time when the admin server was down and unable to contact the managed servers.
So if DomainHealth sounds like it would be useful, give it a try and let me know your feedback in the forums provided on the project's site (you have to first click the Develop menu option for some reason that only SourceForge knows!).

DomainHealth project home page: http://sourceforge.net/projects/domainhealth

DomainHealth help documentation: http://sourceforge.net/apps/mediawiki/domainhealth


Song for today: Gimme Shelter by The Rolling Stones

Thursday, December 10, 2009

New DomainHealth WebLogic monitoring tool version

I've just released the latest version of DomainHealth - version 0.8 (well actually 0.8.1 because of a fix for a last minute bug spotted by Kris).

DomainHealth (DH) is an open source "zero-config" monitoring tool for WebLogic. It collects important server metrics over time, archives these into CSV files and provides a simple web interface for viewing graphs of current and historical statistics.

You can download it from the project home page at http://sourceforge.net/projects/domainhealth.

The help docs for DomainHealth are at: http://sourceforge.net/apps/mediawiki/domainhealth.


This release includes many minor fixes and enhancements (see the Release Notes document listed alongside the DH download file), plus the following major additions:
  • Now provides the ability to harvest and retrieve server statistics using a WLDF module (configured on the fly by DH), rather than using JMX to poll each server for statistics. This is now the default behaviour when running on WLS version 10.3.x or greater. For WLS versions 9.0 to 10.0.x, it still uses JMX Polling. If you prefer to use JMX Polling for the recent WLS versions, you can force this behaviour with a special parameter (see the Help docs). It is worth noting that, although I don't believe the load that the periodic JMX Polling (once a minute) puts on the managed servers is noticeable, I was still keen for DH to move to using WLDF by default. This way, DH acts as a WebLogic 'good citizen' and is also better able to cope with the increased number of MBean statistics that inevitably come with each new DH release.
  • Now shows a lot more interesting Thread Pool statistics on the main page (including Throughput and QueueLength).
  • Previously, for domains with many servers, it was difficult to drill into the statistics for just one specific server at a time, in the graphical web pages. Now you have the option to select which server you want to see on the web page, in isolation, by selecting the server's name from a drop down list.
  • When using the WLDF based mechanism for collecting metrics, statistics for all Work Managers and all Server Channels (protocol server sockets) are now also retrieved and displayed. I have not added this capability for the JMX Polling based mechanism because I'm wary of putting too much load on each managed server during the polling process (I may revisit this decision at a later date).
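
For readers unfamiliar with what "JMX Polling" involves, here is a minimal sketch of the general idea using only the standard javax.management API. It is not DomainHealth's code. As a stand-in for WebLogic's runtime MBeans on a remote managed server, it reads the ThreadCount attribute of the local JVM's own Threading MBean, and the one second pause is an assumption chosen to keep the demo short (DH polls once a minute).

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

class JmxPollSketch {
    /** Reads the current value of one attribute from one MBean. */
    static Object readAttribute(MBeanServerConnection conn, String objectName, String attribute) {
        try {
            return conn.getAttribute(new ObjectName(objectName), attribute);
        } catch (Exception e) {
            throw new IllegalStateException(
                "Could not read " + attribute + " from " + objectName, e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in connection: the local platform MBean server rather than a
        // remote connection to a WebLogic managed server
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        for (int i = 0; i < 3; i++) {
            Object threads = readAttribute(conn, "java.lang:type=Threading", "ThreadCount");
            System.out.println("ThreadCount=" + threads);
            Thread.sleep(1000); // demo interval only; a real poller would wait much longer
        }
    }
}
```

Each poll is a separate request to the monitored JVM, which is why the number of attributes read per cycle matters for the load placed on each server.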

Song for today: Touched by VAST

Saturday, October 24, 2009

New WebLogic book is out

A new book titled Professional Oracle WebLogic Server (Wiley/Wrox, 2009) is out now.

POWLS Book Front Cover
I am a contributing author having written the Web Services chapter (ch.9). A colleague of mine, Josh, wrote the Security chapter and the main bulk of the book was written by Robert, Phil and Greg, with Robert pulling the whole thing together into what we hope is a single cohesive and instructive resource (Robert and Phil are colleagues of mine too).

It's an intermediate to advanced-level, best practices style book aimed at WebLogic Server version 11g (10.3.1). However, for the most part, the content is equally applicable to earlier versions of WebLogic too, from 9.0 onwards, with the notable exception of its EJB 3.0 and JAX-WS 2.1 coverage.

You can get the book from the main book home page, where you can also download the Table of Contents in PDF form, if you first want to better understand what topics are covered. This home page also includes a link for you to download the book's sample code, which includes the Web Services example projects that I wrote to accompany chapter 9.

You can also get the book from Amazon UK, Amazon US and most other major book retailers out there. Parts of the book can be previewed at Google Books.

Enjoy!


Song for today: I've Made Enough Friends by The Wrens

Thursday, September 3, 2009

Attempting to Quickly Diagnose Production Memory Leaks with JRockit

I just wanted to mention my favourite, not well known, JRockit diagnostic command which I've used a few times recently to resolve tricky JVM heap related issues in production environments.

Imagine the situation: You've performance tested and tuned your server and application to the hilt, but on day 3 after going live, the production servers are exhibiting excessive heap memory usage issues. You've been pushed into the noisy server room, the pressure's on and the project sponsor is pacing up and down behind you. Until a sys-admin sorts out a non-headless machine for you to run JRockit's Mission Control on, you're stuck with the command line on the production server you've just SSH'd into. You're desperate to be able to just issue a command or two into the shell to quickly see what's happening in the JVM heap before the sys-admin bounces the server to prevent it from going pop with out of memory errors. Well there is such a command that might just save you....

Using JRockit's JRCMD command line tool you can issue a command called heap_diagnostics.

This command was new in JRockit R26.4, which was bundled with some of the later WLS 9.x versions and with all WLS versions from 10.0 onwards. It produces a diagnostic text dump of the state of the live JRockit heap, written to the standard output of your command shell. The dump includes a summary of the current heap size and heap free, the layout of the native memory used by the JVM, the layout of the JVM's heap, and the state of soft/weak/phantom references and finalizers at the last garbage collection. Most crucially though, the dump also includes the number, percentage and size of every object type currently held in the heap, in order of the largest consumer first. If you suspect a memory leak and look at the names of the classes listed at the top of this list, you have a good chance of very quickly discovering which objects are being leaked. This may not directly point you at the root cause, but it will give you a much better idea of where the problem may lie.

As an example, to produce a diagnostic dump on a Unix based host environment, run the following commands to: (i) make the JRockit JRE and tools visible to your shell, (ii) determine the operating system's process ID of the WebLogic instance you want to probe, and (iii) generate the diagnostic dump into a text file:
$ . /opt/oracle/WLS103/wlserver_10.3/server/bin/setWLSEnv.sh

$ jps -l | grep weblogic
9681 weblogic.Server

$ jrcmd 9681 heap_diagnostics > heapdump.txt

The command takes a second or two to complete (depending on the current size of your heap and how much is in there). In the environment where I was chasing a memory leak, the dump text file was about 0.5 MB in size, which I've truncated and shown below.

In this specific example, by looking at the first few entries in the "Detailed Heap Statistics" section of the dump, showing largest heap consumers first, I was able to take an educated guess that the cause of my memory leak was some SNMP related classes in the package 'monfox' in WebLogic 10.0.1 rather than my application code. In my case, having so many of these types of objects in the heap was not to be expected. However, in other cases, top consuming objects may be legitimate and not be leaked objects, so this is only a potential indicator. Incidentally, if you need a patch for this specific SNMP issue, raise an Oracle Support case using Metalink referring to Bug#8185278.
======== BEGIN OF HEAPDIAGNOSTIC =========================

Total memory in system: 3434975232 bytes
Available physical memory in system: 2001797120 bytes
-Xmx (maximal heap size) is 536870912 bytes
Heapsize: 536870912 bytes
Free heap-memory: 470024072 bytes

mmStartCompaction = 0x8c00000, mmEndCompaction = 0xac00000

Memory layout:
00000000-00002000 ---p 00000000 00:00 0
08048000-08057000 r-xp 00000000 08:06 11421468   /opt/oracle/wls1001/jrockit_150_11/bin/java
08057000-08058000 rwxp 0000f000 08:06 11421468   /opt/oracle/wls1001/jrockit_150_11/bin/java
08060000-080e0000 rwxp 08060000 00:00 0
08100000-08901000 rwxp 08100000 00:00 0
.........truncated.........

--------- Detailed Heap Statistics: ---------
26.1% 12083k   108775   -541k [C
12.8% 5924k     5586    +18k  [B
 7.2% 3338k   142444    -52k  java/lang/String
 6.4% 2974k    54389     -4k  monfox/toolkit/snmp/util/OidTree$Node
 3.8% 1753k    16028     +0k  [Lmonfox/toolkit/snmp/util/OidTree$Node;
 2.6% 1218k    38978    -39k  monfox/toolkit/snmp/agent/ext/table/SnmpMibTableAdaptor$AdaptorMibNode
 1.2% 555k     7069     +0k   java/lang/Class
 0.5% 225k     3213     -2k   java/util/TreeMap$Entry
.........truncated.........
     46258kB total ---
--------- End of Detailed Heap Statistics ---

------------------- Printing heap ---------------------
"o"/"p" = 1k normal/pinned objects
"O"/"P" = 50k normal/pinned objects
"."/"/" = <1k normal/pinned objects fragmentation
" "/"_" = 1k/50k free space
-------------------------------------------------------
OOOOOOOOOOOOOOOOooooooooooooooooooo.oooooooooooo..      0x8ccb5e8
..o.oooooooooooooooooo.O.............OOooooooooooo      0x8cfb770
ooooooooooo.Oooooooooooooooooooooooooooooooooooooo      0x8d13a28
.........truncated.........
__________________________________________________      0x28a36420
____________________________________                    0x28bfd178
         
-------------- Done printing heap ---------------------

--- Verbose reference objects statistics - old collection --------
456.7 MB free memory (of 512.0 MB) after last heap GC, finished 0.137 s ago.
Soft references: 326 (146 only soft reachable, 0 cleared this GC)
    java/lang/ref/SoftReference: 215 (140, 0)
           71 ( 32, 0) java/lang/reflect/Method
           45 ( 45, 0) [Ljava/lang/reflect/Method;
.........truncated.........
    Softly reachable referents not used for at least 228.349 s cleared.
Weak references: 15268 (0 cleared this GC)
    java/lang/ref/WeakReference: 12834 (0)
        11355 (  0)    java/lang/Class
          888 (  0)    com/sun/jmx/interceptor/DefaultMBeanServerInterceptor$ListenerWrapper
.........truncated.........
Final handles: 743 (0 pending finalization, 0 became pending this GC)
          244 (  0, 0) java/util/zip/Inflater
          228 (  0, 0) java/util/jar/JarFile
           48 (  0, 0) com/rsa/jsafe/JA_RSAPublicKey
.........truncated.........
Weak object handles: 42096 (0 cleared this GC)
        26308 (  0)    java/lang/String
        12262 (  0)    sun/misc/Launcher$AppClassLoader
.........truncated.........
Phantom references: 366 (330 only phantom reachable, 0 became phantom reachable this GC)
    com/bea/xbean/store/Locale$Ref: 330 (330, 0)
          330 (330, 0) com/bea/xbean/store/Cursor
    jrockit/vm/ObjectMonitor: 22 (0, 0)
            4 (  0, 0) weblogic/work/ExecuteThread
            3 (  0, 0) weblogic/kernel/ServerExecuteThread
.........truncated.........
--- End of reference objects statistics - old collection ---------

Dark matter: 7207024 bytes
Heap size is not locked

======== END OF HEAPDIAGNOSTIC ===========================
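
If the dump is too large to eyeball, the "Detailed Heap Statistics" rows are regular enough to filter with a few lines of code. The sketch below is a hypothetical helper, not part of JRockit. It assumes each row has the shape shown above (percentage, size in k, instance count, a signed k column, then the type name), skips the signed column, and picks out the types whose name contains a given fragment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeapStatsParser {
    /** One row of the "Detailed Heap Statistics" section. */
    static final class Entry {
        final double percent;
        final long sizeKb;
        final long count;
        final String typeName;

        Entry(double percent, long sizeKb, long count, String typeName) {
            this.percent = percent;
            this.sizeKb = sizeKb;
            this.count = count;
            this.typeName = typeName;
        }
    }

    // Groups: percent, size in k, instance count, type name (signed k column is skipped)
    private static final Pattern ROW = Pattern.compile(
        "^\\s*(\\d+(?:\\.\\d+)?)%\\s+(\\d+)k\\s+(\\d+)\\s+[+-]\\d+k\\s+(\\S+)\\s*$");

    /** Parses every line that looks like a statistics row; other lines are ignored. */
    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ROW.matcher(line);
            if (m.matches()) {
                entries.add(new Entry(Double.parseDouble(m.group(1)),
                    Long.parseLong(m.group(2)), Long.parseLong(m.group(3)), m.group(4)));
            }
        }
        return entries;
    }

    /** Keeps only entries whose type name contains the fragment (this also catches array types). */
    static List<Entry> matching(List<Entry> entries, String fragment) {
        List<Entry> result = new ArrayList<>();
        for (Entry e : entries) {
            if (e.typeName.contains(fragment)) {
                result.add(e);
            }
        }
        return result;
    }

    static long totalSizeKb(List<Entry> entries) {
        long total = 0;
        for (Entry e : entries) {
            total += e.sizeKb;
        }
        return total;
    }

    public static void main(String[] args) {
        // A few rows copied from the dump above
        List<String> sample = Arrays.asList(
            "26.1% 12083k   108775   -541k [C",
            " 6.4% 2974k    54389     -4k  monfox/toolkit/snmp/util/OidTree$Node",
            " 3.8% 1753k    16028     +0k  [Lmonfox/toolkit/snmp/util/OidTree$Node;",
            "     46258kB total ---");
        List<Entry> suspects = matching(parse(sample), "monfox/");
        for (Entry e : suspects) {
            System.out.println(e.sizeKb + "k in " + e.count + " instances of " + e.typeName);
        }
        System.out.println("Total: " + totalSizeKb(suspects) + "k");
    }
}
```

To run it against a real dump, the lines of heapdump.txt could be loaded with java.nio.file.Files.readAllLines() and passed to parse().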


Song for today: Turn Around by Whiskeytown

Sunday, June 14, 2009

Truly Dynamic Web Services on WebLogic using JAX-WS

In my last blog (a long time ago I know) I talked about the potential for using JAX-WS on WebLogic 10.3 to fulfil the 8 key values that I believe are important for Web Service based Messaging. I've been using JAX-WS in WebLogic a lot over the last half a year and I've been very impressed with its power. As promised, I'm now (finally) going to describe the results of my experiments, with JAX-WS Providers/Dispatchers specifically, to see if they can achieve 8 out of 8 on my scorecard. Does JAX-WS give me the power and freedom I desire when doing web services?

For the experiment, I have used a Customer based web service example, like I've used before. Using an Interface-First approach, I created a WSDL and associated XSD Schema for a fictitious "Change Customer Information" HTTP SOAP web service. Then I created the Java Web Service (JWS) implementing a JAX-WS Provider interface, rather than using a strongly-typed JAX-WS Service Endpoint Interface (SEI). To test out the integration between WebLogic's JAX-WS runtime and WebLogic's Security sub-system, I included a WS-Policy in my WSDL to force service consumer applications to provide a WS-Security UsernameToken (username/password) in the SOAP header for authentication.

My code for the Provider is shown below:
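package test.service;

import java.io.StringReader;
import javax.annotation.Resource;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.ws.Provider;
import javax.xml.ws.Service;
import javax.xml.ws.ServiceMode;
import javax.xml.ws.WebServiceContext;
import javax.xml.ws.WebServiceException;
import javax.xml.ws.WebServiceProvider;

@WebServiceProvider(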
  serviceName="CustomerInfoChangeService",
  targetNamespace="testns:custinfochngsvc",
  portName="CustomerInfoChangeServiceSOAP",
  wsdlLocation="/WEB-INF/wsdls/CustomerInfoChangeService.wsdl"
)
@ServiceMode(value=Service.Mode.PAYLOAD)
public class CustomerInfoChangeService implements Provider<Source> {
  private final static String XSD_NMSPC = "testns:custinfo";
  @Resource
  private WebServiceContext context;
 
  public Source invoke(Source request) {  
    try {
      // Check some security stuff
      System.out.println("Service invoked with principal: " + 
                                       context.getUserPrincipal().getName());
      System.out.println("Is in Administrators group? " + 
                                     context.isUserInRole("Administrators"));
   
      // Process request
      Transformer transformer = TransformerFactory.newInstance().newTransformer();
      StreamResult sysOutStream = new StreamResult(System.out);
      System.out.println("Changing following customer info:");
      transformer.transform(request, sysOutStream);
      System.out.println();
   
      // INVOKE REAL XML PROCESSING AND CUSTOMER DB UPDATE LOGIC HERE
   
      // Construct response
      String responseText = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
        "<CustomerInfoChangeAcknowledgement xmlns=\"" + XSD_NMSPC + "\">" +
        "  <Ack>SUCCESS</Ack>" +
        "  <Comment>Successfully processed change for customer</Comment>" +
        "</CustomerInfoChangeAcknowledgement>";
      return new StreamSource(new StringReader(responseText));
    } catch (Exception e) {
      throw new WebServiceException("Error in Provider JAX-WS service", e);
    }
  }  
}

I bundled this Provider JWS into a plain WAR archive and deployed to WebLogic. I then created the service consumer code as a standalone Java application. The code for this client, which uses the JAX-WS Dispatch API to locate and invoke the remote Web Service, is shown here:
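import java.io.IOException;
import java.io.StringReader;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.ws.BindingProvider;
import javax.xml.ws.Dispatch;
import javax.xml.ws.Service;
import org.xml.sax.SAXException;
import weblogic.wsee.security.unt.ClientUNTCredentialProvider;
import weblogic.xml.crypto.wss.WSSecurityContext;
import weblogic.xml.crypto.wss.provider.CredentialProvider;

public class CustomerInfoChangeClient {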
  private final static String WSDL_URL_SUFFIX = "?WSDL";
  private final static String WSDL_NMSP = "testns:custinfochngsvc";
  private final static String WSDL_SRVC_PORT = "CustomerInfoChangeServiceSOAP";
  private final static String WSDL_SRVC_NAME = "CustomerInfoChangeService";
  private final static String XSD_NMSP = "testns:custinfo";
  private final static String USERNAME = "weblogic";
  private final static String PASSWORD = "weblogic";
 
  public static void main(String[] args) {  
    if (args.length <= 0) {
      throw new IllegalArgumentException("Must provide endpoint URL arg");
    }

    try {
      new CustomerInfoChangeClient(args[0]);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public CustomerInfoChangeClient(String endpointURL) throws IOException, 
        SAXException, TransformerException, ParserConfigurationException {
    // Construct request
    String requestText = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
                         "<CustomerInfo xmlns=\"" + XSD_NMSP + "\">" +
                         "  <Id>1</Id>" +
                         "  <Address1>6</Address1>" +
                         "  <PostCode>AB1 2YZ</PostCode>" +
                         "</CustomerInfo>";
    Source request = new StreamSource(new StringReader(requestText));

    // Invoke service operation, adding appropriate credentials
    Service service = Service.create(new URL(endpointURL + WSDL_URL_SUFFIX),
                                      new QName(WSDL_NMSP, WSDL_SRVC_NAME));
    Dispatch<Source> dispatcher = service.createDispatch(new QName(
            WSDL_NMSP, WSDL_SRVC_PORT), Source.class, Service.Mode.PAYLOAD);
    Map<String, Object> rc = ((BindingProvider) dispatcher).
                                                     getRequestContext();
    List<CredentialProvider> credProviders = new 
                                          ArrayList<CredentialProvider>();
    credProviders.add(new ClientUNTCredentialProvider(USERNAME.getBytes(),
                                                    PASSWORD.getBytes()));
    rc.put(WSSecurityContext.CREDENTIAL_PROVIDER_LIST, credProviders);
    //rc.put(BindingProvider.USERNAME_PROPERTY, USERNAME);
    //rc.put(BindingProvider.PASSWORD_PROPERTY, PASSWORD);
    Source response = dispatcher.invoke(request);

    // Process response  
    Transformer transformer = TransformerFactory.newInstance().
                                                   newTransformer();
    StreamResult sysOutStream = new StreamResult(System.out);
    System.out.println("Change customer service result:");
    transformer.transform(response, sysOutStream);
    System.out.println();
  }
}

Notice that I have not used any generated XML-to-Java classes for the parameters and return values from within the client or in the service itself. Once compiled and run, the client application works completely as expected, and successfully authenticates with the remote web service using the specified username/password WS-Security credentials.
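
To give a feel for what working without generated classes looks like where the Provider has its "INVOKE REAL XML PROCESSING" placeholder, here is a minimal sketch that copies a payload Source into a DOM tree with an identity Transformer and reads one element's text. The PayloadReader class and firstElementText method are hypothetical names introduced for illustration. The namespace and element names are taken from the client's request above.

```java
import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class PayloadReader {
    /** Returns the text of the first element with the given name, or null if absent. */
    static String firstElementText(Source payload, String namespace, String localName) {
        try {
            // An identity transform copies any kind of Source into a DOM Document
            DOMResult result = new DOMResult();
            TransformerFactory.newInstance().newTransformer().transform(payload, result);
            Document doc = (Document) result.getNode();
            NodeList matches = doc.getElementsByTagNameNS(namespace, localName);
            return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Payload could not be parsed as XML", e);
        }
    }

    public static void main(String[] args) {
        String requestText = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
            "<CustomerInfo xmlns=\"testns:custinfo\">" +
            "  <Id>1</Id>" +
            "  <Address1>6</Address1>" +
            "  <PostCode>AB1 2YZ</PostCode>" +
            "</CustomerInfo>";
        Source request = new StreamSource(new StringReader(requestText));
        System.out.println("PostCode: " +
            firstElementText(request, "testns:custinfo", "PostCode"));
    }
}
```

A StreamSource can only be consumed once, so a real service that needs several fields would build the Document a single time and query it repeatedly, rather than calling a helper like this per field.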

Here are my observations from the experiment:
  • In the simple example I just used StreamSource for processing the XML request and responses at both ends. However, I could have used other APIs such as DOMSource, JAXBSource, SAXSource or StAXSource from the Java 1.6 standard libraries instead (the 'javax.xml.transform' packages, plus 'javax.xml.bind.util' for JAXBSource), and even mixed and matched these for the request and the response.
  • I have used javax.xml.transform.Source for accessing the SOAP request and response messages, instead of javax.xml.soap.SOAPMessage. With Source I can only access the contents of the SOAP Body (referred to as the 'PAYLOAD'). With SOAPMessage, I can choose to access the XML elements of the SOAP Envelope (including SOAP headers) as well (referred to as the 'MESSAGE'). However, using SOAPMessage restricts me to only being able to process XML using the W3C DOM API, therefore using Source appeals to me more. When using Source, if I need to access SOAP Headers, I'd probably just use JAX-WS Protocol Handlers anyway, to process these headers before or after my main Provider class is called.
  • I don't need to use WebLogic's WSDLC, JWSC or ClientGen Ant tasks because no "build-time" generation of Java classes (using JAXB) or stubs/skeletons is required. For my service, I could have still optionally used JWSC to generate a deployable WAR file with the appropriate web.xml deployment descriptor auto-generated, but I chose to create these artefacts myself, in a way that I can control, using simple Ant build.xml tasks.
  • To ensure that my JAX-WS service is detected properly by the WebLogic runtime, the key thing I needed to do was create a Servlet definition and mapping entry in my web.xml deployment descriptor for my JWS Provider class (even though a provider class does not actually implement javax.servlet.Servlet). During deployment, the WebLogic Web Service Container automatically detects that the web.xml declared servlet is in fact a JWS class and automatically maps the URL specified to an internal WebLogic JAX-WS handling Servlet called "weblogic.wsee.jaxws.JAXWSWebAppServlet". As a result, my transport-protocol agnostic SOAP JWS class is now exposed over HTTP. My web.xml file includes the following:
   <servlet>
      <servlet-name>CustomerInfoChangeService</servlet-name>
      <servlet-class>test.service.CustomerInfoChangeService</servlet-class>
   </servlet>
   <servlet-mapping>
      <servlet-name>CustomerInfoChangeService</servlet-name>
      <url-pattern>/CustomerInfoChangeService</url-pattern>
   </servlet-mapping> 
[UPDATE 19-Jun-09: Actually, after some further testing, I discovered that if I omit the servlet definition/mapping from my web.xml, the JWS Provider web-app still deploys and runs correctly. WebLogic automatically maps the Provider it discovers on deployment to a URL which is the Provider class's name (without package name prefix and without .class suffix). This is in line with the Java EE 5 / Servlet 2.5 specs, where deployment descriptors are now optional. However, it may still be desirable to create a servlet definition/mapping in web.xml to be able to better control the URL of the service.]
  • I would have ideally liked to have included a @RolesAllowed() annotation in my Provider class to help declaratively restrict access to service operations based on role. However, the JAX-WS specification doesn't currently cater for this annotation and WebLogic's JAX-WS implementation does not process this annotation, if it's included.
  • In addition to testing authentication using message-level security (e.g. a WS-Security user token) I also tested authentication via transport-level security using a client provided HTTP Basic Authentication token (see commented out lines of code in the client app above). In both cases, authentication worked properly, and in my Provider class, when I call context.getUserPrincipal() the proper authenticated principal user object is returned.
  • However, when using context.isUserInRole("Administrators") in my Provider class, "true" is only returned when using transport-level security, but not when using message-level security. This means that for message-level security, performing programmatic access control is currently limited to checking the user principal object only - the user's roles can't be queried. Here was the security role I defined in web.xml:
    <security-role>
       <role-name>Administrators</role-name>
    </security-role>
...and here is how I mapped the role to my WebLogic EmbeddedLdap 'Administrators' group in weblogic.xml:
    <security-role-assignment>
       <role-name>Administrators</role-name>
       <principal-name>Administrators</principal-name>
    </security-role-assignment>
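
Given that only the principal is reliably available with message-level security, one possible workaround is to check the caller's name against an application-managed allow-list. The sketch below shows the idea with a hypothetical helper class and a hypothetical set of user names. It is not a WebLogic API, and in a Provider the principal would come from context.getUserPrincipal().

```java
import java.security.Principal;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PrincipalAccessCheck {
    /** True only when a caller is present and its name is in the allow-list. */
    static boolean isAllowed(Principal caller, Set<String> allowedUserNames) {
        return caller != null && allowedUserNames.contains(caller.getName());
    }

    public static void main(String[] args) {
        // Hypothetical allow-list; a real service would load this from configuration
        Set<String> admins = new HashSet<>(Arrays.asList("weblogic"));
        Principal caller = () -> "weblogic"; // stand-in for context.getUserPrincipal()
        System.out.println("Allowed? " + isAllowed(caller, admins));
    }
}
```

This trades the container's group membership for a list the application has to maintain itself, so it only suits services with a small, stable set of privileged users.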

So to wrap up, how does JAX-WS in WebLogic 10.3 stack up against my 8 criteria?

Well pretty well actually. I would say 7.75 out of 8. A full score on the first 7 criteria in the list and on the 8th (integration with the host container's Security Framework), just a 1/4 point dropped. This is due to a lack of full flexibility in defining declarative or programmatic access control for a service, when using message-level authentication.


Song for today: Festival by Sigur Rós