Rob Brooks-Bilson
Tech, Photography, Stuff
Tech, Photography, Stuff
September 21, 2009
So far in this series, we've covered why you would want to cache, what to cache and when, and basic caching architectures. In part 4 of this series, we're going to talk about caching strategies and eviction policies.
A caching strategy is nothing more than an architectural decision on how you're going to manage putting data in and retrieving data from your cache and the corresponding relationship between the cache and your backend data source. There are two main caching strategies you need to be aware of, deterministic and non-deterministic.
A non-deterministic caching strategy involves first looking in the cache for the object or data you want to retrieve. If it's there, your application uses the cached copy. If it's not there, you must then query the backend system for the object or data you want to retrieve. This is by far the most popular caching strategy as it's relatively simple to implement and is very flexible.
A deterministic caching strategy is one in which you always go to the cache for the object or data that you need. It's assumed that if it's not in the cache, then it doesn't exist. This strategy requires that your cache be pre-populated with data as there's no mechanism for a cache miss to query the backend system for the missing object or data.
Both deterministic and non-deterministic caching strategies have their pros and cons. For non-deterministic caching, the upside is that it's simple to implement in code and you have a lot of flexibility in how you do this. The downside to this caching strategy is an issue called stampeding requests, otherwise known as the dog pile. This occurs, usually under load, when a cache miss results in multiple threads simultaneously querying the backend system for the missing cache data. Under this scenario, it's very easy to overwhelm the backend system with requests as the database struggles to fetch the data and repopulate the cache. There are various ways that you can code around this, which we'll discuss later on in another blog post. For now, it's just important to realize that it can happen.
For the purposes of the rest of this discussion as well as the rest of the series, we'll be focusing on non-deterministic caching. That said; let's now turn our attention to cache eviction algorithms. Think of a cache like a box. A box has a limit on how much stuff it can hold before things start falling out when you try to pile on more. A cache is the same way when it comes to the objects and data you store in it – eventually it runs out of room.
Cache eviction policies can be broken down in to two categories: time based and cost based. Time based policies let you associate a time period or an expiration date for individual cache items. This lets you do things like keep an item in the cache for 6 hours, or 30 days, or until December 15, 2040 at 10:00pm. When a request is made to a cache that contains items with time based expirations, the cache first checks to see if the item is expired. If it is, the item is evicted from the cache and is not returned to the operation that called it (most caches simply return null).
Cost based eviction policies work a little differently. A cost based eviction policy doesn't kick in until a cache is full and needs to kick some items out (evict) before allowing new ones in. Most caches give you several cost based eviction policies to choose from. In this scenario, when you attempt to put a new item in the cache, the cache first looks to see if it's full. If it is, it runs whatever cost based eviction policy has been set for the cache and evicts the appropriate item(s). The following are some of the most common cost based eviction policies you'll encounter:
First In First Out (FIFO): The first item that was placed in the cache is the first item to be evicted when it becomes full. It's essential to remember that the first item in the cache is not necessarily the least important. If the first item in your cache is also the most frequently accessed item you might want to think twice about implementing an eviction policy that would result in evicting it from the cache first in the event the cache fills up.
Least Recently Used (LRU): This policy implements an algorithm to track which items in the cache are the least frequently accessed. Various cache providers implement this algorithm in different ways but the result is that the items in the cache that haven't been used in a while are evicted first.
Less Frequently Used (LFU): This algorithm is unique to Ehcahe. It uses a random sampling of items in the cache and picks the item with the lowest number of hits to evict. The Ehcache documentation claims that an element in the lowest quartile of use is evicted 99.99% of the time with this algorithm. In a cache that follows a Pareto distribution (20% of the items in the cache account for 80% of the requests) this algorithm may offer better performance than LRU. For more detailed discussion of various cache eviction algorithms, see the cache algorithms page on Wikipedia.
That's about it for this post on caching strategies and eviction policies. In Part 5 of this series, we'll finally start to take a look at caching in ColdFusion including what's always been there and what's new in ColdFusion 9.
A quick little plug: If you're heading to Adobe MAX 2009 in LA this October and want to know more about caching in ColdFusion 9, check out my session on Advanced ColdFusion Caching Strategies where I'll be covering a lot of what's already been discussed on my blog as well as a whole bunch of new material. I hope to see you there!
September 3, 2009
Welcome to Part 3 in my series on Caching Enhancements in ColdFusion 9. In Part 2, we talked about caching granularity. This time around, were going to spend some time discussing caching architectures. When talking about caching architectures, it's important to understand the type of cache being referred to. Basically, caches come in two flavors: in-process and out-of-process.
An in-process cache operates in the same process as its host application server. As I mentioned in Part 1 of this series, the new caching functionality in ColdFusion 9 is based on an implementation of Ehcache. Because Ehcache is an in-process caching provider that means that the cache operates in the same JVM as the ColdFusion server. The biggest advantage to an in-process cache is that it's lightning fast as data/object serialization is generally not required when writing to or reading from the cache. On the other side of the coin, in-process caches have limitations that you need to be aware of when it comes to system memory - particularly if you're on a 32-bit platform or a system that's light on RAM. On 32-bit systems, the JVM is typically limited to between 1.2GB and 2GB of RAM, depending on platform (although some 32-bit JVM's running on 64-bit systems may be able to use up to 4GB of RAM). Because you have to share this with your application server, that leaves considerably less RAM available to your cache.

In-process caches can be scaled up by adding more RAM, but not out by adding more servers as each cache is local to the application server's JVM it's deployed with. We'll discuss this in more depth when we talk about clustered caching. When using an in-process cache you always need to be aware of the number of items you'll be caching and how much RAM they take up to avoid a sudden spike in cache evictions if the available memory to both your application server and cache tops out. Fortunately for ColdFusion, Ehcache can be configured so that it fails over from RAM based storage to disk in the event that the cache fills up.
Out-of-process caches, like their name suggests, run outside of the same process as the application server. In the Java world, they run inside their own JVM. Out-of-process caches tend to be highly scalable on both 32-bit and 64-bit platforms as they scale both out and up. If you need to scale an out-of-process cache, you simply install more instances of the cache on any machines with spare RAM on your network. The main drawback to out-of-process caches is speed. Data and objects being written to and read from an out-of-process cache must be serialized and deserialized. Although the overhead for doing so is relatively small, it's still considerable enough to have an impact on performance.

Although Ehacahe itself is not an out-of-process cache, it does come with something called Ehcache Server which is available as a WAR file that can be run with most popular web containers or standalone. The Ehcache server has both SOAP and REST based web services API's for cache reads/writes. Another example of an out-of-process cache is the ever popular Memcached.
Now that we've covered the basics of in-process and out-of-process caches, it's time to make things a little more complicated by adding distributed caching and cache clustering to the mix. My experience over the last few years with caching has been that the term distributed tends to be a catch-all for what most would consider a true distributed cache as well as for a clustered cache. Confused yet? Let me attempt to clarify. Most of you are probably already familiar with how clustering works. In the application server world, you take an application server such as ColdFusion and you deploy it on two or more identically configured machines (or you can deploy multiple instances to one or more machines) which you then tie together through hardware and/or software. The result is that you are able to distribute load to your application across multiple servers which allows you to scale your application out. Need to be able to support more users? Add more servers to the cluster. It's the same for caching. If you have an in-process cache, you can't make the cache hold more items
When it comes to cache clustering, the primary reason for doing so is usually that you already have or are planning to deploy your application on a cluster. If you have a clustered application that needs to make use of caching, the first problem you face is that each application server has its own in-process cache which is local to the server. If Server A writes a piece of data to its in-process cache, that data is not available to Server B. This might not be a big deal for some clustered applications that implement sticky sessions, have light load or have data that doesn't necessarily need to be synchronized, but it becomes a serious problem for clusters that are configured for failover, have heavier load, or have cached data that needs to be in synch across every server in the cluster. In these instances, standalone in-process caching doesn't work well. The solution is to cluster your in-process caches as well as your application server. In the case of ColdFusion 9, the underlying Ehcache implementation fully supports caching. When configured, each local cache automatically replicates its content via RMI, JMS, JGroups, TerraCotta, or other plugable mechanisms to all other caches specified in the configuration. There's a small amount of latency while the data replicates but it's negligible in all but the most extreme use cases. I have set this up, tested, and verified it works with the ColdFusion 9 implementation. I'll put up a detailed post of exactly how to do this in a future blog post. The important thing to understand here is that clustering of in-process caches gets you redundancy, but the limit on the size of a single cache is still the limiting factor on scalability (e.g. if the cache you want to cluster has a limit of 500MB of data, clustering the cache between two servers means you are still limited to that 500MB of data in the cache, only now it's stored on two different servers).

Distributed caching differs from clustered caching in that a distributed cache is essentially one gigantic out-of-process cache spread across multiple machines. If you think of a clustered cache as comparable to a clustered application server then a distributed cache is much like a computing grid. Whereas a clustered cache gets you redundancy, a distributed cache gets you horizontal scalability with respect to how much data or how many objects can be put in the cache. Different distributed caching providers handle the exact caching mechanics differently, but the basics remain the same. If you need redundancy in a distributed cache, many distributed caching providers, including Ehcache Server let you cluster distributed cache nodes. The following diagram illustrates how a distributed, out-of-process cache cluster using Ehcache Server might look.

You should note that this is just one of many possible configurations. Using a combination of hardware and software it's possible to build out some pretty sophisticated caching architectures depending on your performance, scalability and redundancy requirements. It's even possible to create hybrid in-process/out-of-process architectures using solutions such as Terracotta.
That's about it for caching architectures. If you want to learn more, a fantastic resource is the website High Scalability. I hope you continue to find this series helpful. In Part 4 we'll cover our last foundation topic - the basics of caching strategies, before moving into ColdFusion 9's specific Ehcache implementation.