Welcome to Part 3 in my series on Caching Enhancements in ColdFusion 9. In Part 2, we talked about caching granularity. This time around, were going to spend some time discussing caching architectures. When talking about caching architectures, it's important to understand the type of cache being referred to. Basically, caches come in two flavors: in-process and out-of-process.

An in-process cache operates in the same process as its host application server. As I mentioned in Part 1 of this series, the new caching functionality in ColdFusion 9 is based on an implementation of Ehcache. Because Ehcache is an in-process caching provider that means that the cache operates in the same JVM as the ColdFusion server. The biggest advantage to an in-process cache is that it's lightning fast as data/object serialization is generally not required when writing to or reading from the cache. On the other side of the coin, in-process caches have limitations that you need to be aware of when it comes to system memory - particularly if you're on a 32-bit platform or a system that's light on RAM. On 32-bit systems, the JVM is typically limited to between 1.2GB and 2GB of RAM, depending on platform (although some 32-bit JVM's running on 64-bit systems may be able to use up to 4GB of RAM). Because you have to share this with your application server, that leaves considerably less RAM available to your cache.

In-process caches can be scaled up by adding more RAM, but not out by adding more servers as each cache is local to the application server's JVM it's deployed with. We'll discuss this in more depth when we talk about clustered caching. When using an in-process cache you always need to be aware of the number of items you'll be caching and how much RAM they take up to avoid a sudden spike in cache evictions if the available memory to both your application server and cache tops out. Fortunately for ColdFusion, Ehcache can be configured so that it fails over from RAM based storage to disk in the event that the cache fills up.

Out-of-process caches, like their name suggests, run outside of the same process as the application server. In the Java world, they run inside their own JVM. Out-of-process caches tend to be highly scalable on both 32-bit and 64-bit platforms as they scale both out and up. If you need to scale an out-of-process cache, you simply install more instances of the cache on any machines with spare RAM on your network. The main drawback to out-of-process caches is speed. Data and objects being written to and read from an out-of-process cache must be serialized and deserialized. Although the overhead for doing so is relatively small, it's still considerable enough to have an impact on performance.

Although Ehacahe itself is not an out-of-process cache, it does come with something called Ehcache Server which is available as a WAR file that can be run with most popular web containers or standalone. The Ehcache server has both SOAP and REST based web services API's for cache reads/writes. Another example of an out-of-process cache is the ever popular Memcached.

Now that we've covered the basics of in-process and out-of-process caches, it's time to make things a little more complicated by adding distributed caching and cache clustering to the mix. My experience over the last few years with caching has been that the term distributed tends to be a catch-all for what most would consider a true distributed cache as well as for a clustered cache. Confused yet? Let me attempt to clarify. Most of you are probably already familiar with how clustering works. In the application server world, you take an application server such as ColdFusion and you deploy it on two or more identically configured machines (or you can deploy multiple instances to one or more machines) which you then tie together through hardware and/or software. The result is that you are able to distribute load to your application across multiple servers which allows you to scale your application out. Need to be able to support more users? Add more servers to the cluster. It's the same for caching. If you have an in-process cache, you can't make the cache hold more items

When it comes to cache clustering, the primary reason for doing so is usually that you already have or are planning to deploy your application on a cluster. If you have a clustered application that needs to make use of caching, the first problem you face is that each application server has its own in-process cache which is local to the server. If Server A writes a piece of data to its in-process cache, that data is not available to Server B. This might not be a big deal for some clustered applications that implement sticky sessions, have light load or have data that doesn't necessarily need to be synchronized, but it becomes a serious problem for clusters that are configured for failover, have heavier load, or have cached data that needs to be in synch across every server in the cluster. In these instances, standalone in-process caching doesn't work well. The solution is to cluster your in-process caches as well as your application server. In the case of ColdFusion 9, the underlying Ehcache implementation fully supports caching. When configured, each local cache automatically replicates its content via RMI, JMS, JGroups, TerraCotta, or other plugable mechanisms to all other caches specified in the configuration. There's a small amount of latency while the data replicates but it's negligible in all but the most extreme use cases. I have set this up, tested, and verified it works with the ColdFusion 9 implementation. I'll put up a detailed post of exactly how to do this in a future blog post. The important thing to understand here is that clustering of in-process caches gets you redundancy, but the limit on the size of a single cache is still the limiting factor on scalability (e.g. if the cache you want to cluster has a limit of 500MB of data, clustering the cache between two servers means you are still limited to that 500MB of data in the cache, only now it's stored on two different servers).

Distributed caching differs from clustered caching in that a distributed cache is essentially one gigantic out-of-process cache spread across multiple machines. If you think of a clustered cache as comparable to a clustered application server then a distributed cache is much like a computing grid. Whereas a clustered cache gets you redundancy, a distributed cache gets you horizontal scalability with respect to how much data or how many objects can be put in the cache. Different distributed caching providers handle the exact caching mechanics differently, but the basics remain the same. If you need redundancy in a distributed cache, many distributed caching providers, including Ehcache Server let you cluster distributed cache nodes. The following diagram illustrates how a distributed, out-of-process cache cluster using Ehcache Server might look.

You should note that this is just one of many possible configurations. Using a combination of hardware and software it's possible to build out some pretty sophisticated caching architectures depending on your performance, scalability and redundancy requirements. It's even possible to create hybrid in-process/out-of-process architectures using solutions such as Terracotta.

That's about it for caching architectures. If you want to learn more, a fantastic resource is the website High Scalability. I hope you continue to find this series helpful. In Part 4 we'll cover our last foundation topic - the basics of caching strategies, before moving into ColdFusion 9's specific Ehcache implementation.