Stomp

Implementation details for advanced developers.

 

How does Stomp work?  Stomp is the combination of three separate modules.  They are:

 

-         A framework for handling and simplifying transaction management in a JDO environment. This framework deals with multi-threaded applications and answers important questions about data caching, performance, and distributed objects.

-         A framework for adding method-level services to objects in a somewhat transparent way.  The most important of these services is a transaction layer, designed to abstract away many of the details of using the transaction and object-location framework above.

-         A bytecode enhancer built with the Serp bytecode toolkit, which takes normal objects and makes them transparently use the two frameworks above.

 

The best way to understand how Stomp works is to first understand how you would use the first two frameworks if the enhancer weren't available.  We'll start with the most fundamental one, but before we get into that, you will need to become familiar with JDO.  Kodo JDO is an excellent implementation.  Even if you choose another vendor, the examples available in Kodo's evaluation download will help you understand what JDO is and how it is used, and are a great way to get started.  (Note that it is not my goal to push any particular JDO vendor, and Stomp has been designed to make it easy to plug in other JDO implementations.)

 

After this point, I’ll assume you have a working understanding of JDO.  This includes how to write metadata to make your objects persistent, how to use the JDO Query language to fetch objects from the database, etc.  You don’t need to be an expert, but you should probably have successfully put an object in a database, and pulled it back out in a different JVM, using JDO, before you continue.
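
As a quick refresher, a bare-bones JDO round trip (no Stomp involved) looks something like the sketch below.  This is only a sketch: the vendor factory class name is a placeholder, and Pet is the example persistent class used later in this document; only the javax.jdo calls are standard.

import java.util.Properties;

import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManager;
import javax.jdo.PersistenceManagerFactory;
import javax.jdo.Transaction;

public class PlainJdoExample {

    public static void main ( String[] args ) {
        // Connection and vendor properties normally live in a properties file.
        Properties props = new Properties ();
        props.setProperty ( "javax.jdo.PersistenceManagerFactoryClass",
                            "com.example.VendorPMFactory" );   // placeholder vendor class
        PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory ( props );

        PersistenceManager pm = pmf.getPersistenceManager ();
        Transaction tx = pm.currentTransaction ();
        try {
            tx.begin ();
            pm.makePersistent ( new Pet () );   // Pet must be declared in the JDO metadata
            tx.commit ();
        } finally {
            if ( tx.isActive () ) tx.rollback ();
            pm.close ();
        }
    }
}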

 

How does Stomp help me use JDO?  OK, so you’ve read about JDO, you’re excited about the possibility of transparent persistence, and you want to get started.  There are several important questions you’ll have to answer before you proceed.  These are:

 

-         What’s the deal with this PersistenceManager concept?  When should I reuse an existing PersistenceManager, and when should I use a new one?

-         How am I going to deal with transactions?  In particular, how can I allow multiple transactions in multiple threads, and how am I going to enroll different objects in different method calls in the same transaction?

-         How am I going to cache database data?  How do I walk the performance tightrope between too many calls to the database, and using cached data that is out of sync?

-         How am I going to keep object data for things in different JVMs in sync?

 

First, let's talk about using PersistenceManagers.  Stomp's suggestion is to use a combination of PMs.  First, a single, "read-only" PersistenceManager provides access to read-only objects only.  This PersistenceManager never becomes involved in transactions, and any attempt to start a transaction, or to change the persistent fields of objects found with this PM, will result in an Exception being thrown.  This PM builds up a large supply of cached data over time, and is a great performance booster for "read-mostly" applications, where the number of transactions is not very high.  This read-only PM, like all PersistenceManagers in Stomp, is made available to the application's classes via the JDOFactory, in particular its getReadOnlyPersistenceManager () method.  This is the primary mechanism for discovering the current state of objects in the database.  More on how this information is maintained (efficiently) later.
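
For example, a report that only reads persistent data might look something like the sketch below.  Treat it as a sketch: the Person class and the filter string come from the running example further down, the Query calls are plain JDO, and I'm assuming the read-only PM is shared and left open rather than closed by the caller.

import java.util.Collection;

import javax.jdo.PersistenceManager;
import javax.jdo.Query;

public class KarmaReport {

    // Read-only access: no transaction, data is served from the long-lived cache.
    public Collection findHighKarmaPeople () {
        PersistenceManager roPm = JDOFactory.singleton ().getReadOnlyPersistenceManager ();
        Query query = roPm.newQuery ( Person.class, "_karma > 10" );   // ordinary JDOQL
        Collection people = (Collection) query.execute ();
        // Reading persistent fields of these objects is fine; changing them, or
        // starting a transaction on roPm, would result in an exception.
        return people;
    }
}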

 

When any object wants to change persistent data, a transactional version of that object must be found in a "transactional" PersistenceManager.  A transactional PersistenceManager is one that has been obtained from the JDOFactory via a getTransactionalPersistenceManager () call, and it is handed to the client with a Transaction already active.  The client simply finds their read-only object in the transactional PM ( via a JDOFactory.findInPm ( Object, PM ) call, or by requesting the object by id directly from the PM ), and then alters fields or calls mutator methods on that object.  When the client is done making persistent changes, closing the PM results in the transaction being closed.  The framework is responsible for returning the same transactional PM to all requests in the same thread, as long as a transaction is open, and for not committing the transaction until all clients have closed their PersistenceManagers.  For example, let's consider these two persistent classes.  When a person has free time, she pets her Pet.  The pet responds by increasing the person's karma:

 

public class Pet extends AbstractPersistentObject {

    private int _petCount = 0;

    public void pet ( Person petter ) {
        PersistenceManager txPm = JDOFactory.singleton ().getTransactionalPersistenceManager ();
        try {
            Pet me = (Pet) JDOFactory.singleton ().findInPm ( this, txPm );
            me._petCount++;
        } finally {
            txPm.close ();
        }

        petter.increaseKarma ();
    }

    public Object findInPm ( PersistenceManager pm ) { return this; }
}

 

public class Person extends AbstractPersistentObject {

    private int _karma = 0;
    private Pet _myFriend;

    public void increaseKarma () {
        PersistenceManager txPm = JDOFactory.singleton ().getTransactionalPersistenceManager ();
        try {
            Person me = (Person) JDOFactory.singleton ().findInPm ( this, txPm );
            me._karma++;
        } finally {
            txPm.close ();
        }
    }

    public void freeTime () {
        PersistenceManager txPm = JDOFactory.singleton ().getTransactionalPersistenceManager ();
        try {
            Pet myPet = (Pet) JDOFactory.singleton ().findInPm ( _myFriend, txPm );
            myPet.pet ( this );
        } finally {
            txPm.close ();
        }
    }

    public Object findInPm ( PersistenceManager pm ) {
        _myFriend = (Pet) JDOFactory.singleton ().findInPm ( _myFriend, pm );
        return this;
    }
}

 

In this example, calling 'person.freeTime ()' results in three different nested calls to JDOFactory.getTransactionalPersistenceManager, but since all the calls occur in the same Thread, all three transactional PMs will be part of the same transaction.  In fact, they will be the exact same PM.  So if Pet decides to roll back the transaction, the transaction started in Person will be rolled back as well.  The author of the Pet class does not need to know, and does not care, whether or not a transaction is active at the time 'pet (Person)' is called, and it is critical to code reuse that this developer not need to know the details of the enclosing transaction.  This idea, a major advantage of the EJB spec, in which transaction settings may be declared for objects in metadata and objects are automatically enrolled in running transactions, has been applied to Stomp.  (Of course, in EJBs you would not write any transaction logic in your source code at all… this is what the full Stomp enhancer offers.  More on that later.)

 

For those cases where this behavior is not desired and the author really wants to ensure that pm.close () results in a transaction commit, the JDOFactory.newTransaction () method is provided.  Note that many transactions may be going on simultaneously, either in different threads or as the result of a newTransaction call, and they remain separate because the objects enrolled in the different transactions all come from different PersistenceManagers.  Any reads on the read-only objects will return the data as it stands in the datastore, independent of anything going on in those transactions at the time.
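
A sketch of how that might look, assuming newTransaction () hands back a fresh transactional PM whose close () commits independently of any enclosing transaction (the AuditRecord class is purely illustrative):

import javax.jdo.PersistenceManager;

public class AuditLogger {

    public void recordEvent ( String message ) {
        // Assumed usage: a brand new PM / transaction, independent of the caller's.
        PersistenceManager auditPm = JDOFactory.singleton ().newTransaction ();
        try {
            auditPm.makePersistent ( new AuditRecord ( message ) );   // illustrative class
        } finally {
            auditPm.close ();   // commits this transaction even if an outer one is still open
        }
    }
}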

 

OK, so you commit a transaction… what happens then?   As with many of the details of Stomp, this behavior is customizable.  By default, though, the following steps occur.  First, the read-only version, and all outstanding transactional objects, are refreshed, so they reflect the new state of the database.  This happens transparently and is built into the JDOFactory object (which delegates the work to RefreshManager).  What about other applications running at the same time?  Stomp can be configured so the JDOFactory will notify, asynchronously, all other interested JVMs about the ids of the objects that changed, so they can follow suit and refresh the appropriate objects.  A Kodo-specific distribution implementation using JMS comes standard with Stomp.  When one JVM changes anything in the database, it notifies a JMS topic, which in turn passes the information to all other JVMs subscribed to the topic.  In this way, the read-only PM is able to build up a potentially large cache of database information, avoiding unnecessary database calls.  The database is checked only at the times that persistent information is changed.  This can offer tremendous performance improvements for many applications.

 

What about other caches in my application that depend on persistent data but are not themselves persistent?  The data refresh in persistent objects happens transparently, but many other application data caches may need to find out about the persistent data change, especially caches in the object itself.  This first framework solves this problem with the Persistent interface, which is effectively an extension of the PersistenceCapable interface provided by JDO.  Any object implementing this interface will receive a dataChanged () notification after all the objects involved in the transaction have been refreshed.  This ensures that all persistent data is in sync with the datastore before any caches in the persistent objects are refreshed.  After this, the RefreshManager notifies any registered listeners about changes to objects, according to the priorities specified in those listeners.  So a change to the database in some other JVM results in, first, transparently updating the persistent fields in the read-only objects so they are in sync with the database; second, notifying the persistent objects that this has been done; and third, notifying any other interested parties as well, in an application-defined callback order.  The Persistent interface also provides an important tool for dealing with caches when objects are deleted: the getDeletedInfo () method.  This is a way for deleted objects to "come back from the grave" and give just enough info to other caches so they can figure out exactly what needs to be done.  The JDO object id is often not useful in these situations, and having access to another simple field or two can make handling these events much easier.
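
As an example, a persistent object that keeps a derived, non-persistent cache might use these hooks roughly as follows.  The class, its fields, and the exact signatures of dataChanged () and getDeletedInfo () are my guesses from the description above; treat this as a sketch rather than the real interface.

import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class PersonDirectory extends AbstractPersistentObject {

    private Collection _members;          // persistent field
    private transient Set _memberCache;   // derived, non-persistent cache

    // Invoked after every object touched by the transaction has been refreshed,
    // so _members already reflects the datastore when this runs.
    public void dataChanged () {
        _memberCache = null;              // stale; rebuild lazily on next use
    }

    public Set getMemberCache () {
        if ( _memberCache == null ) {
            _memberCache = new HashSet ( _members );
        }
        return _memberCache;
    }

    // A deleted object can "come back from the grave" with just enough info for
    // other caches to clean themselves up; the payload here is illustrative.
    public Object getDeletedInfo () {
        return "PersonDirectory:" + System.identityHashCode ( this );
    }
}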

 

Wow, this is perfect.  Where do I sign up?  OK, I'm taking a few liberties with your response to these ideas.  More than likely you're just trying to understand it all, and you won't see any major problem areas until you start using the framework.  I'll try to save you some time and point out the issues in advance (in effect, these problems were the motivation for developing the second framework in Stomp; more on that later):

 

 

-         Transaction code clutters up business logic in objects.  Highly repetitive try { findInPm (…) } finally { pm.close () } blocks end up all over your source code.  If at some point you decide to alter the logic in these blocks ( say, to handle the non-transactional objects described below ), you'll likely need to change these try-finally blocks in all your objects.

-         Transient objects are difficult to deal with, especially if they have relations to pre-existing persistent objects.  The trouble is that JDO does not offer transparent ways to handle interactions between different PersistenceManagers.  When you make a transient object persistent, all objects reachable from the base object are also made persistent.  For this operation to succeed, any reachable persistent objects must be associated with the PM doing the persisting.  More than likely, though, the related objects will be associated with the read-only PM, leading to JDO Exceptions.  This problem is solved transparently by implementing the 'findInPm' method of the TransientTransactional interface, but doing this for all persistent objects is a chore.

-         Non-transactional objects can be tricky to handle.  Stomp supports a separate PM for getting “non-transactional” objects, which are objects whose data is loaded from the database, but whose data can be changed in a transient way in the current JVM.  These objects, coupled with transient objects, can be a big performance boost over comparable solutions with Entity beans, which are always persistent.  The drawback to using them is that the try{} finally{} blocks for dealing with transactions become even more complicated.

-         You need to implement the Persistent interface in your application code, so using this framework is not totally transparent.  Very likely, you’ll feel obligated to extend the AbstractPersistentObject base class, which implements much of the interface for you.

 

One thing to keep in mind: if you are going to use JDO, you should seriously consider using at least this first framework of Stomp.  The problems described here are not created by the framework; you'll have to solve them regardless if you want to take advantage of JDO.  The framework does, however, solve several other problems you'll have to tackle to use this spec, including how PersistenceManagers are going to be managed, how data will be moved efficiently between JVMs, how application caches can be kept up-to-date with changes to persistent data, etc.

 

So, you can take this first framework and be well on your way to a great JDO solution.  Of course, you don't have to use transient objects or non-transactional objects, but why not have everything?  For many applications, large performance gains can be made by avoiding unnecessary trips to the database, and these object types make that possible.  The second framework of Stomp, a method-level service layer, was designed specifically to deal with the restrictions listed above.

 

How does the method-level services framework help me?  The most glaring problem in the first framework is the repetitive try-finally transaction logic that clutters up business code.  An obvious next step, then, is to find a way to move this logic into one place.  The service framework accomplishes this by using dynamic proxies to intercept method calls, effectively wrapping any transactional method call with the needed try-finally code (if you're not familiar with dynamic proxies, you may want to take a minute and check out the java.lang.reflect package, in particular the Proxy class and the InvocationHandler interface).  The stomp.service.ServiceFactory's getServiceLayer () method is able to produce one of these proxies for any persistent object.
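
To make the idea concrete, here is a stripped-down sketch of the kind of InvocationHandler a transaction service layer amounts to.  This is not Stomp's actual ServiceFactory code; only the java.lang.reflect calls are standard, and the JDOFactory calls are the ones shown in the earlier examples.

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

import javax.jdo.PersistenceManager;

public class TransactionHandler implements InvocationHandler {

    private final Object _delegate;   // the real persistent object

    private TransactionHandler ( Object delegate ) {
        _delegate = delegate;
    }

    // Produce a proxy that implements the given interface and routes every call
    // through invoke () below.
    public static Object wrap ( Object delegate, Class iface ) {
        return Proxy.newProxyInstance ( iface.getClassLoader (),
                                        new Class[] { iface },
                                        new TransactionHandler ( delegate ) );
    }

    public Object invoke ( Object proxy, Method method, Object[] args ) throws Throwable {
        // Join the thread's open transaction, or start a new one.
        PersistenceManager txPm =
            JDOFactory.singleton ().getTransactionalPersistenceManager ();
        try {
            // Re-target the call at the transactional copy of the object.
            // (The real service layer also relocates persistent arguments into txPm.)
            Object txTarget = JDOFactory.singleton ().findInPm ( _delegate, txPm );
            return method.invoke ( txTarget, args );
        } catch ( InvocationTargetException e ) {
            throw e.getTargetException ();   // surface the business exception itself
        } finally {
            txPm.close ();   // commits once every nested caller has closed its PM
        }
    }
}

The real service layer does quite a bit more (non-transactional objects, configurable services, and so on), but this is the core trick.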

 

Using dynamic proxies, we are able to greatly simplify the logic in the business objects.  Dynamic proxies, though, come with a few restrictions of their own.  They are:

 

-         Dynamic proxies only work for interfaces, so you’ll have to write an interface with all the public methods of your persistent objects.  I’d suggest throwing in a factory for instantiating and finding persistent objects as well, which means a bit of extra typing when you go to create a new persistent object.

 

The following are a direct consequence of the restriction above, but are stated here to make things clear:

-         You can’t cast a service layer to the particular implementation that is being wrapped.

-         You can't call protected or private methods on a serviced object.

 

Additionally, to pick up these services, your persistent objects need to implement a different interface, stomp.service.ServiceEnabled.  As with the Persistent interface, there is an abstract base class that you will likely feel obligated to have all your persistent objects extend (AbstractServiceEnabled).  Let's take a look at the previous Person-Pet example to see how it changes when implemented with framework #2:

 

public class PetImpl extends AbstractServiceEnabled implements Pet {

    private int _petCount = 0;

    public void pet ( Person petter ) {
        _petCount++;
        petter.increaseKarma ();
    }

    public Object findInPm ( PersistenceManager pm ) { return this; }
}

 

public class PersonImpl extends AbstractServiceEnabled implements Person {

    private int _karma = 0;
    private PetImpl _myFriend;

    public void increaseKarma () {
        _karma++;
    }

    public void freeTime () {
        Pet myPet = (Pet) _myFriend.getServiceLayer ();
        myPet.pet ( this );
    }

    public Object findInPm ( PersistenceManager pm ) {
        _myFriend = (PetImpl) ((Pet) JDOFactory.singleton ().findInPm ( _myFriend, pm )).getServiceDelegate ();
        return this;
    }
}

 

Both of these classes have, in general, gotten much simpler.  We have, however, two additional classes to write:

 

public interface Pet extends ServiceEnabled {
    public void pet ( Person petter );
}

 

public interface Person extends ServiceEnabled {
    public void increaseKarma ();
    public void freeTime ();
}

 

Not such a big deal in this case, but it is extra work.  In general, I've found that writing the extra interface, coupled with a factory for finding and creating objects, usually pays off in the long run anyway.  It's good OO practice, since you can substitute different persistent implementations of the interfaces transparently.  Back to the example…

 

It's clear that the business logic in Pet and Person has gotten much cleaner.  We are required to write interfaces for all our persistent objects, and to use these interfaces as our sole means of communication with the persistent objects, but that is probably not so bad.  A couple of lines of code get more complicated, notably the 'findInPm' implementation in PersonImpl, which now has to jump through a few hoops to get a PetImpl associated with the correct PersistenceManager.  This type of logic is common when creating and finding persistent objects, which is why it is important to encapsulate it in a factory.  The example behaves just as it used to: someone calls person.freeTime ().  The person wraps its related Pet object in a ServiceLayer, then calls pet ( this ) on the wrapped Pet.  The service layer will intercept this method call, start (or join) a transaction, find the Pet in the transactional PM, and find the Person argument in the same PM, invoking the 'pet' method on this transactional Pet object.  In this way, _petCount++ is legal, and petter.increaseKarma () is being called on a transactional object, so the subsequent _karma++ is also legal.  Eventually, the pet () method ends, which returns control to the ServiceLayer, which closes (commits) the transactional PersistenceManager.  The read-only versions of _karma and _petCount are updated, and execution continues.

 

Anyone who uses EJBs will be perfectly comfortable using interfaces and factories to access persistent objects, as this is exactly the way the EJB spec works (the Home interface is the factory, the Remote interface is the business method interface, and the bean is your implementation code).  You'll have to be comfortable with this restriction as well if you want to use this framework (the bytecode enhancer gets around this too; more on that later).

 

If you adopt this second framework for persisting your objects, you’ll find that you still have a few restrictions (in addition to those mentioned for using dynamic proxies above).  They are:

 

-         Transient objects are still a pain to persist.  You still have to implement ‘findInPm’ in all your persistent objects.

-         The framework still isn’t transparent to your business code, and you’ll probably find that implementing ServiceEnabled obliges you even more to extend a Stomp-specific base class (in this case, AbstractServiceEnabled).

 

The problems with non-transactional objects have disappeared, though, as has all the repetitive handling of transactions, which is a big improvement.  I've written many persistent objects with this exact framework and found it to be very effective.  The restrictions are easily worked around, and implementing some interface or extending some base class is probably something you would have done anyway.  The EJB spec, for example, has no problem with requiring developers to write Remote interfaces and to extend particular base classes, and it is an incredibly popular development option.

 

By the way… the service framework was designed to add the service of starting and stopping transactions to method calls.  It didn't take long to realize, though, that any method-level service could be plugged into this system.  The ServiceFactory creates a ServiceLayer by looking up a services.xml file for the object being serviced.  If it's not found, persistence is added as the sole service, but you can define this file and add your own custom services to methods very easily.  Benchmarking can be added during development to track down performance problems, remote logging can be turned on from a help desk to deal with a user complaint, etc.

 

All these advantages are yours if you choose to use this framework.  You can take another step, though, and go for the holy grail… completely transparent object persistence.

 

What is the Stomp Enhancer, and how does it work?  The stomp.enhance package includes a bytecode enhancer, which takes normal java objects and makes them transparently use the two frameworks described above.  The enhancer takes a normal java object, like Pet.class, and produces three new class files that function together: a proxy, a delegate, and an interface.  The proxy stores a reference to a delegate, wrapped in a service layer from framework #2.  The delegate is altered to extend the needed base classes and to implement findInPm (etc.).  The interface is used to make the dynamic-proxy-based service layer function correctly.  The result is that all the remaining restrictions of the second framework are removed… objects can be persisted in a totally transparent way.  Let's look at our running example to see what it will now look like:

 

public class Pet {

    private int _petCount = 0;

    public void pet ( Person petter ) {
        _petCount++;
        petter.increaseKarma ();
    }
}

 

public class Person {

    private int _karma = 0;
    private Pet _myFriend;

    public void increaseKarma () {
        _karma++;
    }

    public void freeTime () {
        _myFriend.pet ( this );
    }
}

 

That's right… it's just as simple as could be.  So how does it work?  All that transaction logic has to happen somehow, and it still does.  It's just added, transparently, to the objects after they are compiled.  Here's a general description of the runtime execution of a person.freeTime () call: in the source code, freeTime () calls _myFriend.pet (this).  The enhancer replaces the _myFriend field access with a call to a method it adds, whose implementation wraps the related object in a service layer and returns it.  So _myFriend.pet (), at runtime, looks more like get_MyFriend ().pet (), where get_MyFriend () returns _myFriend wrapped in a service layer.
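
Conceptually, the enhanced Person ends up behaving like the hand-written sketch below.  The method name get_MyFriend () and the ServiceFactory call are illustrative only; the real enhancer works on bytecode, splits the class into the proxy/delegate/interface trio described earlier, and the generated names will differ.

// Roughly what the enhanced freeTime () amounts to -- an illustrative
// decompilation, not actual enhancer output.
public class Person {

    private int _karma = 0;
    private Pet _myFriend;

    public void increaseKarma () {
        _karma++;
    }

    public void freeTime () {
        // The field access "_myFriend.pet (this)" was rewritten into this call chain:
        get_MyFriend ().pet ( this );
    }

    // Added by the enhancer: return the relation wrapped in its service layer, so
    // the pet () call above is intercepted and run inside a transaction.
    private Pet get_MyFriend () {
        return (Pet) ServiceFactory.getServiceLayer ( _myFriend );   // exact call is a guess
    }
}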

 

Now the pet (this) call is intercepted by the service layer, a transaction is started (or joined), the objects are associated with the correct PersistenceManager, and so on.  When the method completes, exactly what you would intuitively hope for has happened: the person object has had its _karma increased ( the read-only delegate is updated when the transaction commits ), the database has been changed, and any other JVM with the same person object has been notified of this change.  All with the basic java source code above.

 

You won't be surprised to learn that there are a few restrictions with the enhancer… however, virtually all of them are a result of the newness of the technology and will likely be removed in time.  Here they are:

 

-         Known Restrictions

 

These issues are typically very easy to work around, and will be removed in time.  In exchange for these drawbacks, Stomp gives you all the power described in the earlier frameworks, but in a totally transparent way.  Transient objects become much easier to work with, as Stomp implements findInPm for you.  When transactions commit, Stomp can re-find the delegate object in the read-only PersistenceManager, which means old transactional objects do not become defunct at commit time.  You are not required (though still encouraged) to write interfaces and factories for all your persistent objects.  The list goes on.

 

For those applications that can handle these restrictions, Stomp can be an ideal solution.  Your java source code is absolutely free of persistence logic, and you can add method-level services in a completely transparent way.  You can turn on benchmarking with a simple change to the services.xml file, without even recompiling.  Future releases may allow you to alter the method-level services for object classes, or even particular instances, efficiently at runtime.

 

The best part is, Stomp is open source, so you get the benefit of being able to fix critical bugs yourself.  You'll have other people working on the details for you, so it's like getting a free team of people to solve your persistence, transaction, data distribution, and method-service problems all at once.  You can really focus on writing the particular code that you have to write, and only that code.  Best of all, one advanced programmer in your group can learn the inner workings of Stomp, tweaking and improving it to meet the needs of your application, while all the other developers write simple, business-logic-oriented java classes.  This division of labor has been a goal of many Sun specs, particularly the EJB spec.  While those specs have been somewhat successful in achieving this aim, Stomp takes the idea to a new level.  If Stomp succeeds, your productivity will soar.

 

OK, it's not completely transparent for all developers.  You do have to tell Stomp when you want to make an object persistent, and when you want to load an object from the database.  But come on, of course you've got to interact with something to that extent…  To do this, you'll use the JDOFactory.persistNewObject () method to put things in the database, and use the JDO query language in combination with JDOFactory.newQuery () to get a JDO Query to load objects.  These APIs are extremely simple and can be picked up in an hour or two by any developer.  Other than this, all the rest of the code is yours.
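
For instance (a sketch only: I'm assuming persistNewObject () takes the new object directly, and that newQuery () mirrors the standard JDO newQuery ( Class, String ) signature):

import java.util.Collection;
import java.util.Iterator;

import javax.jdo.Query;

public class Bootstrap {

    public static void main ( String[] args ) {
        // Put a brand new object into the database.
        Person buddy = new Person ();
        JDOFactory.singleton ().persistNewObject ( buddy );

        // Load objects back out with an ordinary JDOQL filter.
        Query query = JDOFactory.singleton ().newQuery ( Person.class, "_karma > 0" );
        Collection goodPeople = (Collection) query.execute ();
        for ( Iterator it = goodPeople.iterator (); it.hasNext (); ) {
            Person person = (Person) it.next ();
            person.freeTime ();   // the enhancer handles the transaction transparently
        }
    }
}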

 

OK, I'm convinced.  How do I get started?  If you are writing an application from scratch, you have to try out Stomp.  The code you have to write to use Stomp is trivial, and you'll have to write all the rest of the code anyway.  If you don't like Stomp for any reason, you can drop it and use some other persistence mechanism with a minimal loss of time.  Check out an example or two to get a feel for the basics of getting an object stomped, but then just dive in and use Stomp on your main project.

 

If you’re thinking of converting an existing project, spend some time playing around with the sample code.  You’ll probably want to write a few fairly complicated examples to see how it performs against your chosen solution.  Hopefully you’ll like it enough to switch.  If not, please share your complaints, as they help to push Stomp development forward.  Of course, I’d love to hear about your successes with Stomp as well.

 

Write your source code.  Stomp it.  Deploy it.  Enjoy.