Caching Enhancements in ColdFusion 9 – Part 5: Getting to Know Ehcache

In previous versions of ColdFusion (before ColdFusion 9, that is), there were three built-in mechanisms you could use for caching: persistent variable scopes, query caching and the cfcache tag. As I mentioned in Part 2 of the series, each of these methods has inherent limitations and generally requires a good deal of additional programming to gain any semblance of control over the actual cache, if that is even possible. In many cases it isn't practical, or even possible, to see how much information is stored in one of these caching mechanisms. All of this changes with the introduction of Ehcache as the underlying cache provider in ColdFusion 9. Five parts into this series, I just realized I've mentioned Ehcache several times but never took the time to talk about what exactly it is and how it's been implemented in ColdFusion 9.

The Ehcache project was started by a gentleman named Greg Luck. Ehcache can best be described as "... a widely used Java distributed cache for general purpose caching, Java EE, and lightweight containers." Caches are implemented as key-value stores, much like a ColdFusion structure. One important feature of an Ehcache cache is that it can store its contents in memory, on disk, or both. Optionally, disk caches can be configured to survive JVM restarts (or in our case, ColdFusion server restarts). Caches can be configured via an XML configuration file named ehcache.xml or programmatically at runtime. Ehcache ships as a JAR file and runs in-process within an application server's JVM. In the case of ColdFusion 9, the JAR file and the XML file used to configure it (ehcache.xml) are both located in:

/JRun4/servers/server_name/cfusion-ear/cfusion-war/WEB-INF/cfusion/lib

ColdFusion 9 implements three types of caches using the Ehcache engine: template caches, object caches and Hibernate caches. Each of these cache types has its own use cases and we'll cover them in depth in future blog posts. For now, here's a quick summary of what each cache type is and generally what it's used for.

The template cache is designed for caching entire web pages as well as page fragments (sections of web pages). You shouldn't confuse this with the template cache mentioned in the ColdFusion Administrator. That template cache is concerned with compiled ColdFusion templates stored in memory. It's unfortunate that this dual use of the term "template cache" exists in ColdFusion so you need to be aware of which type of cache is being referred to when you see template cache mentioned. In terms of the Ehcache template cache implementation in ColdFusion 9, it's a fairly automatic process with ColdFusion handling the bulk of the work managing keys, cache gets/puts and expiry/eviction from the cache. Working with the template cache is done exclusively using the cfcache tag. Here's a very basic example:

<cfoutput>
I'm real-time dynamic data #now()# <br/>
</cfoutput>

<cfcache action="serverCache">
<cfoutput>
I'm cached dynamic data: #now()# <br/>
</cfoutput>
</cfcache>

<cfoutput>
I'm also real-time dynamic data #now()# <br/>
</cfoutput>

<cfcache action="serverCache">
<cfoutput>
I'm cached dynamic data too: #now()# <br/>
</cfoutput>
</cfcache>

In this code, the first section displays the current date/time. The second section uses the cfcache tag with action="serverCache". This action specifies that we want to use server side caching (Ehcache) and is now the default action in ColdFusion 9. To cache a fragment of code in a ColdFusion page we simply have to wrap it in a cfcache block. The next section of code is again entirely dynamic. The fourth and final section is wrapped in another set of cfcache tags. When you execute this code for the first time, the output will show the same date/time for all four sections of code, like this:

I'm real-time dynamic data {ts '2009-11-16 10:49:50'}
I'm cached dynamic data: {ts '2009-11-16 10:49:50'}
I'm also real-time dynamic data {ts '2009-11-16 10:49:50'}
I'm cached dynamic data too: {ts '2009-11-16 10:49:50'}

Running the code a few seconds later has different results:

I'm real-time dynamic data {ts '2009-11-16 10:51:16'}
I'm cached dynamic data: {ts '2009-11-16 10:49:50'}
I'm also real-time dynamic data {ts '2009-11-16 10:51:16'}
I'm cached dynamic data too: {ts '2009-11-16 10:49:50'}

In this case, the output contains both dynamic data (the first and third lines) and cached data (the second and fourth lines). Using the template cache, it's easy to see how you can mix dynamic data and multiple cached fragments on the same page. There's really not a whole lot for you to do. You simply wrap the content you want to cache in a cfcache block and ColdFusion handles the rest. There are a few things you can control, like cache timeouts, but we'll cover those in a full blog post dedicated to the template cache and all of its features.

The object cache gives you much more granular control over what you cache than the template cache does. Using the object cache, you can pretty much cache anything you want – simple values, complex variables, objects, files, and just about anything else you want to throw at it. The advantage to using the object cache is that you have complete control over key names, get/put/remove operations, and cache expiry/eviction. This takes a little more work on your part but what you give up in convenience you easily gain back in flexibility and control. You can work with the object cache using both the cfcache tag as well as the new caching functions introduced in ColdFusion 9. Here's a very basic example of how you can cache the results of a query in an object cache using the cfcache tag:

<cfcache
    action="get"
    id="artistQuery"
    name="getArtists">

<!--- go to the cache. If the data is not there,
 go to the db then repopulate the cache --->
<cfif isNull(getArtists)>

    <!--- call getter then setter to retrieve new
     value then update the cache --->
    <cfquery name="getArtists" datasource="cfartgallery">
        SELECT *
        FROM artists
    </cfquery>

    <cfcache
        action="put"
        id="artistQuery"
        value="#getArtists#"
        timespan="#createTimeSpan(0,0,1,0)#">

</cfif>

<h3>Query:</h3>
<cfdump var="#getArtists#">

In this code, the first thing we do is try to pull our query out of the cache using the cfcache tag with action="get". The key that identifies our cached query in the object cache is set using id="artistQuery". Of course, the first time we run this the query won't be in the cache, so the result variable we specified (name="getArtists") will be null. After we attempt to pull the query from the cache, we use the isNull function to see if anything was returned from our get operation. If getArtists is null, we know that the query data isn't in the cache, so we need to run our query, retrieve the data and then place it in the cache using the cfcache tag with action="put". When we do our cache put, we set id="artistQuery", which is the name of our cache key. The value to store in the cache is our ColdFusion query (value="#getArtists#"), and we specify how long the data should be cached using the timespan attribute (one minute in this example).

The first time you execute the page (with debugging turned on), you'll notice that there's debugging information for your executed SQL queries, because the query data didn't exist in the cache yet and you told ColdFusion to go off and get it from the database and put it into the cache. If you execute the page again within the timespan, you won't see any debug info for your SQL queries because ColdFusion is pulling the data from the cache and not from the database. It's important to note that in the dump of the query data from this example, Cached will always be False, regardless of whether you are accessing cached data or not. This is because Cached refers to whether or not the result set is in the ColdFusion query cache, which it is not: we are using ColdFusion's Ehcache implementation here and not the built-in query cache. There's a lot more you can do with the object cache, which we'll cover in another blog post.
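The same get-then-put flow can also be written with the caching functions instead of the cfcache tag. What follows is only a minimal sketch, assuming the cacheGet and cachePut functions introduced in ColdFusion 9 and the same cfartgallery datasource; the key name simply mirrors the tag-based example above.

```cfml
<!--- try the cache first; a miss leaves getArtists null/undefined --->
<cfset getArtists = cacheGet("artistQuery")>

<cfif isNull(getArtists)>
    <!--- cache miss: run the query, then store the result for one minute --->
    <cfquery name="getArtists" datasource="cfartgallery">
        SELECT *
        FROM artists
    </cfquery>

    <cfset cachePut("artistQuery", getArtists, createTimeSpan(0,0,1,0))>
</cfif>

<h3>Query:</h3>
<cfdump var="#getArtists#">
```

Whether you prefer the tag or the functions is largely a matter of style; the functions tend to read more naturally inside cfscript blocks and CFC methods.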

If you're using ColdFusion 9's new Hibernate ORM functionality, you can configure the ORM so that it uses Ehcache as its second level cache. This is a more complicated topic than we have time to discuss now, so let's table it for another blog post.

When you create a ColdFusion application and your application uses an Application.cfc or Application.cfm file, ColdFusion automatically creates a template cache and an object cache for you. These caches are bound to your application name (defined in your cfapplication tag for an Application.cfm file or this.name if you are using Application.cfc). If you have an unnamed application, ColdFusion will create a cache that is shared and accessible among all unnamed applications. It's important to remember that caches are not tied to ColdFusion scopes. If your application times out, this does not affect the cache(s) you create with Ehcache.

That's about it for getting to know the basics of Ehcache in ColdFusion 9. In Part 6 we'll start looking at the ehcache.xml file and how it can be used to configure the behavior of the caches you create in ColdFusion 9.

Caching Enhancements in ColdFusion 9 - Part 4: Caching Strategies & Eviction Policies

So far in this series, we've covered why you would want to cache, what to cache and when, and basic caching architectures. In part 4 of this series, we're going to talk about caching strategies and eviction policies.

A caching strategy is nothing more than an architectural decision about how you're going to put data into and retrieve data from your cache, and how the cache relates to your backend data source. There are two main caching strategies you need to be aware of: deterministic and non-deterministic.

A non-deterministic caching strategy involves first looking in the cache for the object or data you want to retrieve. If it's there, your application uses the cached copy. If it's not there, you must then query the backend system for the object or data you want to retrieve. This is by far the most popular caching strategy as it's relatively simple to implement and is very flexible.

A deterministic caching strategy is one in which you always go to the cache for the object or data that you need. It's assumed that if it's not in the cache, then it doesn't exist. This strategy requires that your cache be pre-populated with data as there's no mechanism for a cache miss to query the backend system for the missing object or data.
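As a rough sketch of what pre-populating looks like in practice, a deterministic cache could be filled when the application starts. The Application.cfc below is an assumption for illustration only: the application name, the myDatasource datasource, the countries table, and the countryList key are all hypothetical, and the cachePut function it uses is one of the ColdFusion 9 caching functions covered later in the series.

```cfml
component {

    this.name = "deterministicCacheDemo"; // hypothetical application name

    // Fill the cache once, when the application starts.
    public boolean function onApplicationStart() {
        var q = new Query();
        q.setDatasource("myDatasource"); // hypothetical datasource
        q.setSQL("SELECT code, name FROM countries");

        // No timespan is passed, so the cache's default expiry settings apply.
        cachePut("countryList", q.execute().getResult());
        return true;
    }

}
```

A page would then call cacheGet("countryList") and treat a null result as "this data doesn't exist" rather than falling back to the database. For that assumption to hold, the cache has to be sized and configured so the item isn't expired or evicted behind your back.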

Both deterministic and non-deterministic caching strategies have their pros and cons. For non-deterministic caching, the upside is that it's simple to implement in code and you have a lot of flexibility in how you do this. The downside to this caching strategy is an issue called stampeding requests, otherwise known as the dog pile. This occurs, usually under load, when a cache miss results in multiple threads simultaneously querying the backend system for the missing cache data. Under this scenario, it's very easy to overwhelm the backend system with requests as the database struggles to fetch the data and repopulate the cache. There are various ways that you can code around this, which we'll discuss later on in another blog post. For now, it's just important to realize that it can happen.
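One common way to soften the dog pile is to let only one thread rebuild a missing item while the others wait, and to have each waiting thread re-check the cache before doing any work. The following is only a sketch of that idea, assuming ColdFusion 9's cacheGet and cachePut functions and a named cflock; the cache key, lock name, datasource, table, and timeouts are all hypothetical.

```cfml
<cfset report = cacheGet("salesReport")>

<cfif isNull(report)>
    <!--- only one thread at a time may rebuild this cache entry --->
    <cflock name="rebuild_salesReport" type="exclusive" timeout="30">

        <!--- re-check: another thread may have filled the cache while we waited --->
        <cfset report = cacheGet("salesReport")>

        <cfif isNull(report)>
            <cfquery name="report" datasource="myDatasource">
                SELECT region, SUM(amount) AS total
                FROM sales
                GROUP BY region
            </cfquery>

            <cfset cachePut("salesReport", report, createTimeSpan(0,0,10,0))>
        </cfif>

    </cflock>
</cfif>

<cfdump var="#report#">
```

With this arrangement a cache miss under load results in one trip to the database instead of one per waiting thread. The trade-off is that the waiting threads are blocked for the duration of the query, and a thread that can't get the lock within the timeout will throw an error unless you handle it.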

For the purposes of the rest of this discussion as well as the rest of the series, we'll be focusing on non-deterministic caching. That said, let's now turn our attention to cache eviction algorithms. Think of a cache like a box. A box has a limit on how much stuff it can hold before things start falling out when you try to pile on more. A cache is the same way when it comes to the objects and data you store in it – eventually it runs out of room.

Cache eviction policies can be broken down into two categories: time based and cost based. Time based policies let you associate a time period or an expiration date with individual cache items. This lets you do things like keep an item in the cache for 6 hours, or 30 days, or until December 15, 2040 at 10:00pm. When a request is made to a cache that contains items with time based expirations, the cache first checks to see if the item is expired. If it is, the item is evicted from the cache and is not returned to the operation that called it (most caches simply return null).
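To make the time based case concrete, here is a small sketch using ColdFusion 9's cachePut and cacheGet functions, which are covered properly later in the series. It assumes the third argument to cachePut is the time span the item should live for; the key names and the cached value are made up for the example.

```cfml
<cfset data = "some value worth caching">

<!--- keep the item for 6 hours from the time it is cached --->
<cfset cachePut("sixHourItem", data, createTimeSpan(0,6,0,0))>

<!--- keep the item for 30 days --->
<cfset cachePut("thirtyDayItem", data, createTimeSpan(30,0,0,0))>

<!--- an expired (or missing) item comes back as null --->
<cfset item = cacheGet("sixHourItem")>
<cfif isNull(item)>
    <cfoutput>Item is not in the cache.</cfoutput>
<cfelse>
    <cfoutput>#item#</cfoutput>
</cfif>
```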

Cost based eviction policies work a little differently. A cost based eviction policy doesn't kick in until a cache is full and needs to kick some items out (evict) before allowing new ones in. Most caches give you several cost based eviction policies to choose from. In this scenario, when you attempt to put a new item in the cache, the cache first looks to see if it's full. If it is, it runs whatever cost based eviction policy has been set for the cache and evicts the appropriate item(s). The following are some of the most common cost based eviction policies you'll encounter:

First In First Out (FIFO): The first item that was placed in the cache is the first item to be evicted when it becomes full. It's essential to remember that the first item in the cache is not necessarily the least important. If the first item in your cache is also the most frequently accessed item you might want to think twice about implementing an eviction policy that would result in evicting it from the cache first in the event the cache fills up.

Least Recently Used (LRU): This policy implements an algorithm to track which items in the cache were accessed least recently. Various cache providers implement this algorithm in different ways, but the result is that the items in the cache that haven't been used in a while are evicted first.

Less Frequently Used (LFU): This particular implementation is specific to Ehcache. It uses a random sampling of items in the cache and picks the item with the lowest number of hits to evict. The Ehcache documentation claims that an element in the lowest quartile of use is evicted 99.99% of the time with this algorithm. In a cache that follows a Pareto distribution (20% of the items in the cache account for 80% of the requests), this algorithm may offer better performance than LRU. For more detailed discussion of various cache eviction algorithms, see the cache algorithms page on Wikipedia.
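To see how the three policies can disagree about what to throw out, consider this toy sketch. It is not how Ehcache (or any real cache) implements eviction: the bookkeeping data is invented, the pickVictim function is hypothetical, and it scans every item rather than sampling the way Ehcache's LFU does.

```cfml
<cfscript>
// Hypothetical bookkeeping for three cached items: the order they were
// added, a tick for when each was last read, and a hit count.
items = [
    {key="a", added=1, lastUsed=9, hits=50},
    {key="b", added=2, lastUsed=3, hits=7},
    {key="c", added=3, lastUsed=5, hits=2}
];

// Return the key of the item with the smallest value for the given field.
function pickVictim(items, field) {
    var victim = items[1];
    var i = 0;
    for (i = 2; i <= arrayLen(items); i++) {
        if (items[i][field] < victim[field]) {
            victim = items[i];
        }
    }
    return victim.key;
}

writeOutput("FIFO evicts: " & pickVictim(items, "added") & "<br/>");    // a (oldest entry)
writeOutput("LRU evicts: " & pickVictim(items, "lastUsed") & "<br/>");  // b (longest since last read)
writeOutput("LFU evicts: " & pickVictim(items, "hits") & "<br/>");      // c (fewest hits)
</cfscript>
```

Notice that FIFO would evict item a even though it has by far the most hits, which is exactly the risk described above.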

That's about it for this post on caching strategies and eviction policies. In Part 5 of this series, we'll finally start to take a look at caching in ColdFusion including what's always been there and what's new in ColdFusion 9.

A quick little plug: If you're heading to Adobe MAX 2009 in LA this October and want to know more about caching in ColdFusion 9, check out my session on Advanced ColdFusion Caching Strategies where I'll be covering a lot of what's already been discussed on my blog as well as a whole bunch of new material. I hope to see you there!

Caching Enhancements in ColdFusion 9 - Part 3: Caching Architectures

Welcome to Part 3 in my series on Caching Enhancements in ColdFusion 9. In Part 2, we talked about caching granularity. This time around, we're going to spend some time discussing caching architectures. When talking about caching architectures, it's important to understand the type of cache being referred to. Basically, caches come in two flavors: in-process and out-of-process.

An in-process cache operates in the same process as its host application server. As I mentioned in Part 1 of this series, the new caching functionality in ColdFusion 9 is based on an implementation of Ehcache. Because Ehcache is an in-process caching provider, the cache operates in the same JVM as the ColdFusion server. The biggest advantage of an in-process cache is that it's lightning fast, as data/object serialization is generally not required when writing to or reading from the cache. On the other side of the coin, in-process caches have limitations that you need to be aware of when it comes to system memory - particularly if you're on a 32-bit platform or a system that's light on RAM. On 32-bit systems, the JVM is typically limited to between 1.2GB and 2GB of RAM, depending on platform (although some 32-bit JVMs running on 64-bit systems may be able to use up to 4GB of RAM). Because the cache has to share this memory with your application server, that leaves considerably less RAM available to your cache.

In-process caches can be scaled up by adding more RAM, but not out by adding more servers as each cache is local to the application server's JVM it's deployed with. We'll discuss this in more depth when we talk about clustered caching. When using an in-process cache you always need to be aware of the number of items you'll be caching and how much RAM they take up to avoid a sudden spike in cache evictions if the available memory to both your application server and cache tops out. Fortunately for ColdFusion, Ehcache can be configured so that it fails over from RAM based storage to disk in the event that the cache fills up.

Out-of-process caches, like their name suggests, run outside of the same process as the application server. In the Java world, they run inside their own JVM. Out-of-process caches tend to be highly scalable on both 32-bit and 64-bit platforms as they scale both out and up. If you need to scale an out-of-process cache, you simply install more instances of the cache on any machines with spare RAM on your network. The main drawback to out-of-process caches is speed. Data and objects being written to and read from an out-of-process cache must be serialized and deserialized. Although the overhead for doing so is relatively small, it's still considerable enough to have an impact on performance.

Although Ehcache itself is not an out-of-process cache, it does come with something called Ehcache Server, which is available as a WAR file that can be run with most popular web containers or standalone. Ehcache Server has both SOAP and REST based web service APIs for cache reads/writes. Another example of an out-of-process cache is the ever popular Memcached.

Now that we've covered the basics of in-process and out-of-process caches, it's time to make things a little more complicated by adding distributed caching and cache clustering to the mix. My experience over the last few years with caching has been that the term distributed tends to be a catch-all for what most would consider a true distributed cache as well as for a clustered cache. Confused yet? Let me attempt to clarify. Most of you are probably already familiar with how clustering works. In the application server world, you take an application server such as ColdFusion and you deploy it on two or more identically configured machines (or you can deploy multiple instances to one or more machines) which you then tie together through hardware and/or software. The result is that you are able to distribute load to your application across multiple servers which allows you to scale your application out. Need to be able to support more users? Add more servers to the cluster. It's the same for caching. If you have an in-process cache, you can't make the cache hold more items

When it comes to cache clustering, the primary reason for doing so is usually that you already have or are planning to deploy your application on a cluster. If you have a clustered application that needs to make use of caching, the first problem you face is that each application server has its own in-process cache which is local to the server. If Server A writes a piece of data to its in-process cache, that data is not available to Server B. This might not be a big deal for some clustered applications that implement sticky sessions, have light load or have data that doesn't necessarily need to be synchronized, but it becomes a serious problem for clusters that are configured for failover, have heavier load, or have cached data that needs to be in sync across every server in the cluster. In these instances, standalone in-process caching doesn't work well. The solution is to cluster your in-process caches as well as your application server. In the case of ColdFusion 9, the underlying Ehcache implementation fully supports cache clustering. When configured, each local cache automatically replicates its content via RMI, JMS, JGroups, Terracotta, or other pluggable mechanisms to all other caches specified in the configuration. There's a small amount of latency while the data replicates, but it's negligible in all but the most extreme use cases. I have set this up, tested, and verified it works with the ColdFusion 9 implementation. I'll explain exactly how to do this in a future blog post. The important thing to understand here is that clustering of in-process caches gets you redundancy, but the size of a single cache is still the limiting factor on scalability (e.g. if the cache you want to cluster has a limit of 500MB of data, clustering the cache between two servers means you are still limited to that 500MB of data in the cache, only now it's stored on two different servers).

Distributed caching differs from clustered caching in that a distributed cache is essentially one gigantic out-of-process cache spread across multiple machines. If you think of a clustered cache as comparable to a clustered application server, then a distributed cache is much like a computing grid. Whereas a clustered cache gets you redundancy, a distributed cache gets you horizontal scalability with respect to how much data or how many objects can be put in the cache. Different distributed caching providers handle the exact caching mechanics differently, but the basics remain the same. If you need redundancy in a distributed cache, many distributed caching providers, including Ehcache Server, let you cluster distributed cache nodes. The following diagram illustrates how a distributed, out-of-process cache cluster using Ehcache Server might look.

You should note that this is just one of many possible configurations. Using a combination of hardware and software it's possible to build out some pretty sophisticated caching architectures depending on your performance, scalability and redundancy requirements. It's even possible to create hybrid in-process/out-of-process architectures using solutions such as Terracotta.

That's about it for caching architectures. If you want to learn more, a fantastic resource is the website High Scalability. I hope you continue to find this series helpful. In Part 4 we'll cover our last foundation topic - the basics of caching strategies, before moving into ColdFusion 9's specific Ehcache implementation.

Caching Enhancements in ColdFusion 9 - Part 2: Caching Granularity

In Part 1 of this series, I talked about what caching is and why you would want to consider it as part of your application design. In this post, I'm going to spend some time talking about caching granularity. Caching granularity is just a fancy way of saying "what to cache". Before we go further, let's take a look at various caching opportunities you have when architecting an application:

As you can see, there are quite a few places where you can implement caching. For the purposes of our discussion, we're going to focus only on caching at the ColdFusion application server level. We'll take a look at what you can cache within your applications as well as the pros and cons associated with each item. There are 5 basic items to consider for caching at the application server level:

Data - Most ColdFusion developers have cached data at some point or another. In its simplest form, caching data is nothing more than taking a simple value like a username, or some other data type such as a structure or list, and sticking it in a shared scope variable in the application, session, client or server scope.

Pros

  • Easy to implement
  • Easy to invalidate individual data elements

Cons

  • Most data still needs to be manipulated before it can be rendered - especially values stored in lists, arrays and structs.

Query Result Sets - Another popular technique familiar to ColdFusion developers is query caching. I don't think I know a ColdFusion developer who doesn't make regular use of this feature. This was one of the earliest caching enhancements made to ColdFusion and it's dead simple to implement. In fact, it's as simple as adding one of two possible attributes to the cfquery tag (cachedwithin or cachedafter). Here's an example that caches query results for 60 minutes:

<cfquery
    name="getUsers"
    datasource="myPeeps"
    cachedwithin="#createTimeSpan(0,0,60,0)#">
    select userID
    from users
</cfquery>

Pros

  • Simple to implement
  • Will provide performance gain in many cases
  • ColdFusion 8 added support for cfstoredproc and cfqueryparam

Cons

  • No visibility into the cache
  • Difficult to invalidate single cached queries
  • Clearing the entire query cache does it for the entire server
  • Recordsets still need to be processed before being displayed. This can have serious consequences for CPU and memory.
  • Storage of an entire recordset when only partial data will be used
  • Cache miss results in re-execution of the query. Can lead to the "dog-pile effect", which we'll cover in a later post.

It's also possible to cache query results in ColdFusion by assigning the result set of a cfquery operation to a shared scope variable such as a session or application variable. There are also additional pros and cons to using this method:

Pros

  • Allows for more granular control over cached items

Cons

  • Requires programmatic cache management
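As a sketch of what "programmatic cache management" means for this approach, the example below keeps the getUsers result set in the application scope and only runs the query when it isn't already there. It assumes the page runs inside a named application, and the application.users variable name is just an illustration.

```cfml
<!--- run the query only if the result isn't already in the application scope --->
<cfif NOT structKeyExists(application, "users")>
    <cflock scope="application" type="exclusive" timeout="10">
        <!--- re-check inside the lock in case another request got here first --->
        <cfif NOT structKeyExists(application, "users")>
            <cfquery name="getUsers" datasource="myPeeps">
                select userID
                from users
            </cfquery>
            <cfset application.users = getUsers>
        </cfif>
    </cflock>
</cfif>

<cfdump var="#application.users#">
```

Invalidating this one cached query is then a matter of calling structDelete(application, "users") wherever the underlying data changes, which is the kind of granular control the built-in query cache doesn't give you. The cost is that expiry, locking and memory use are now your responsibility.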

Objects - Objects in ColdFusion can refer to native CFC based objects or those instantiated through other technologies such as COM, CORBA and Java. Until ColdFusion 9, the only way to natively cache an object in ColdFusion was to place it in a shared scope variable such as an application or session variable. We'll talk about how this changes in ColdFusion 9 in a later post. For now, consider the pros and cons of caching objects.

Pros

  • Objects can represent complex relationships that may be impossible or at the least very expensive to compute at the data tier

Cons

  • Objects may need to be serialized/deserialized depending on the caching mechanism being used.
  • Requires programmatic cache management

Partial Page Content (Fragments) - Caching partial page content is something that's always been possible in ColdFusion but has never been elegant - until ColdFusion 9. Prior to version 9, you could cache part of a page by using the cfsavecontent tag and caching the enclosed content in a shared scope variable such as an application or session variable. There are also several custom tag based solutions that achieved the same thing (always storing in a shared scope variable).

Pros

  • Allows you to cache sections or fragments of content
  • Multiple cached fragments can be used within a single page.
  • Works well in situations where pages are made up of customized content, but the content itself is not necessarily unique

Cons

  • Requires programmatic cache management
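Here is a minimal sketch of the pre-ColdFusion 9 cfsavecontent technique described above. It assumes the page runs inside a named application; the application.navFragment variable and the markup being cached are placeholders.

```cfml
<cfif NOT structKeyExists(application, "navFragment")>
    <!--- build the fragment once and capture the generated markup --->
    <cfsavecontent variable="navHtml">
        <cfoutput>
            <ul>
                <li>Generated at #timeFormat(now(), "HH:mm:ss")#</li>
            </ul>
        </cfoutput>
    </cfsavecontent>
    <cfset application.navFragment = navHtml>
</cfif>

<!--- every request reuses the stored markup --->
<cfoutput>#application.navFragment#</cfoutput>
```

The time shown in the fragment stays frozen at the moment it was first generated until something removes application.navFragment, which is the programmatic cache management listed in the cons.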

Entire Web Pages - The final type of content to consider caching is the entire web page generated by ColdFusion. In terms of pure performance, this is the most desirable item to cache. Realistically, though, it's often impossible to cache entire web pages because of the amount of dynamic content on a page, or because the page is updated too frequently. Caching entire ColdFusion generated pages goes back pretty far in the language's history and has been supported via the cfcache tag. The main issue in versions of ColdFusion prior to ColdFusion 9 has been that the cfcache tag has always cached full pages to disk for server side caching. While this would be OK for static files served up by your web server, disk based caches are relatively slow for application servers when compared to RAM based caches. A secondary issue with cfcache pre-ColdFusion 9 is that there was no fine-grained control over the cache, making cache management difficult at best. All that changes in ColdFusion 9, of course.

Pros

  • Provides for the fastest performance

Cons

  • Won't work for pages with lots of customized content (see partial page caching)
  • May be problematic if the page content is updated too frequently
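For reference, full page caching with cfcache can be as small as a single tag at the top of the template. The sketch below assumes ColdFusion 9, where a cfcache tag without a closing tag applies to the whole page; the one hour timespan and the page content are arbitrary.

```cfml
<!--- first line of the page: cache the generated output for one hour --->
<cfcache action="serverCache" timespan="#createTimeSpan(0,1,0,0)#">

<cfoutput>
    <h1>Product catalog</h1>
    <p>Page generated at #now()#</p>
</cfoutput>
```

If the timestamp stays the same across repeated requests, the page is being served from the cache rather than regenerated.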

Now that we've discussed what you can cache, here are a few additional tips worth considering:

Cache as close to the final state as possible

  • E.g. don't cache a recordset if you'll ultimately use it to build a dropdown box
  • Cache entire pages whenever possible

Cache to static files whenever possible and let your web server serve the files

  • Works well for content that rarely changes
  • For dynamic sites, look to other options

Be mindful of cache size

  • May limit what/how much you can cache

I hope this has given you a good overview of the types of things that can be cached in ColdFusion. The next post in this series will introduce caching architectures.

Caching Enhancements in ColdFusion 9 - Part 1: Why Cache

One ColdFusion 9 feature I haven't heard much buzz about, but which I think has the potential to really enhance high performance and large scale ColdFusion applications, is caching. ColdFusion has always had caching capabilities, but more often than not they've been black boxed, giving the developer limited control and visibility over the process. All that changes in ColdFusion 9 with a major overhaul of the cfcache tag. The biggest single enhancement here is the implementation of the popular distributed caching provider Ehcache under the covers. What this means is that ColdFusion now implements one of the most popular and certainly one of the fastest caching mechanisms available for Java.

Before I get too deeply into configuration and code, I want to take a little time to talk about caching theory, strategy, and patterns. Ehcache changes the caching game in ColdFusion, and a lot of the knowledge we have as ColdFusion developers about caching is no longer relevant. Some of it in fact is just plain problematic, and I hope to shed some light on those issues and talk about how Ehcache helps solve those problems as well as gotchas to look out for when implementing large caching systems.

Just so that we're all on the same page, let's start with a definition of caching as found on Wikipedia:

"...a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache."

There are two important concepts here. The first is that cached data is duplicate data. The second is that we duplicate it where it would otherwise be expensive to compute or fetch, relative to how quickly it can be grabbed from the cache. Keep these two things in mind as we continue through this post.

When a lot of people talk about caching, they talk about it in terms of performance. You may want to cache a particular web page because it's slow to load, or perhaps you want to cache the stats shown on a particular page because it takes a long time to run the query that crunches the numbers you're going to display. These are both valid cases where using cached data can speed up the performance of your application. What I find to be a more compelling use case, though, is caching for scalability. What I mean by caching for scalability is using cached data to reduce the load on critical resources such as the database, app server, web server, network, or client. At each of these phases there's an opportunity to use cached data to allow you to do more with less. What's really cool here is that a byproduct of caching for scalability tends to be increased application performance.

Let's look at an example involving the database. Say your database is capable of handling 100 requests per second. Now what if you need to be able to handle more requests? One option would be to throw more hardware at the problem - increase the amount of memory available to the server, add more processors, or maybe even add a second or third database server to cluster and distribute the load. That's certainly one option, but it's also expensive and potentially complicated to manage. A second option would be to cache the data you're requesting. Let's assume you're able to cache the data such that you achieve a hit ratio of 90% (hitRatio = hits/(hits+misses)). That is, 9 out of every 10 requests for data go to the cache instead of to the database (certainly doable in most circumstances). What you've now done is effectively reduce your database load to 10 requests per second for every 100 requests coming in. This means that the same database with the addition of a cache is now able to scale by a factor of 10. That's a pretty significant increase in scalability.
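To make the arithmetic concrete, here's a quick sketch in CFML. The numbers are just the ones from the example above (100 requests, 90 of them served from the cache), and the variable names are made up for illustration:

```cfml
<!--- Illustrative numbers from the example: 100 requests, 90 served from the cache --->
<cfset hits = 90>
<cfset misses = 10>

<!--- hitRatio = hits / (hits + misses) --->
<cfset hitRatio = hits / (hits + misses)>

<!--- Only the cache misses still reach the database --->
<cfset dbRequests = misses>

<!--- Total requests served for every request that reaches the database --->
<cfset scaleFactor = (hits + misses) / misses>

<cfoutput>
Hit ratio: #hitRatio#<br>
Requests reaching the database: #dbRequests#<br>
Scale factor: #scaleFactor#
</cfoutput>
```

With these inputs the hit ratio is 0.9, 10 requests reach the database, and the scale factor is 10. Try plugging in a lower hit ratio (say 50 hits and 50 misses) to see how quickly the benefit drops off.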

That's it for Part 1 of this series. Stay tuned for Part 2 where I'll discuss what to cache and why. If you're planning to be at Adobe MAX 2009, stop by my session on Advanced ColdFusion Caching where I'll be talking about this as well as all of the great new Caching features in ColdFusion 9 in a lot more depth.

More on ColdFusion 9's Virtual File System: Dealing with Directories and Files

In a previous entry, I introduced ColdFusion 9's new RAM based virtual file system (VFS). One question that came up over and over again was how long files stored in the VFS persist. That's a pretty straightforward question. The answer, however, isn't quite as simple. Right now, as of the initial public beta of ColdFusion 9, files saved to the VFS persist until server restart. Obviously this isn't an ideal situation in all cases.

There are a few items you should consider when working with the VFS in ColdFusion 9 when it comes to file and directory persistence as well as security. First, a directory must be created in the VFS before you can write to it.

<cfset content="This is a test">

<cfdirectory action="create" directory="ram://myDir/mySubDir">
<cffile action="write" output="#content#" file="ram://myDir/mySubDir/foo.txt"/>

By default, any directory and any file in the VFS can be read or written to by any .cfm page or CFC. If you need to create a secure VFS environment, you can do so using sandbox security through the ColdFusion Administrator. I won't go into the details here as it's covered in the beta documentation.

The next issue to be aware of is that directories and files in the VFS currently persist until the server is restarted. I'd be surprised if ColdFusion 9 ships this way, as I see it as risky to allow anyone on a server (especially a shared server) to create as many files/directories in RAM as they want. Sandboxing prevents people from gaining access to files and directories that they shouldn't have access to, but it doesn't prevent them from tying up system resources by unnecessarily leaving virtual files littered around the server. I don't know how the ColdFusion Engineering team is planning to deal with this, but I would think at a minimum they would provide a server-wide setting in the ColdFusion Administrator letting a server admin specify how long files/directories should be allowed to persist in RAM before the server runs a job to delete them. Which brings me to another point - if you're planning to read/write files to the VFS, you need to make sure you always verify the existence of a directory or file before you try to read it - especially if Adobe adds a server-wide way for an admin to specify a timeout for virtual files.
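Here's a minimal sketch of that kind of defensive check. It reuses the directory and file names from the earlier example, and the vfsDir and vfsFile variable names are just ones I made up for illustration. It relies on DirectoryExists and FileExists working against ram:// paths:

```cfml
<!--- Illustrative paths, reused from the earlier example --->
<cfset vfsDir = "ram://myDir/mySubDir">
<cfset vfsFile = vfsDir & "/foo.txt">

<!--- Recreate the directory if it is missing (e.g. after a server restart) --->
<cfif NOT DirectoryExists(vfsDir)>
    <cfdirectory action="create" directory="#vfsDir#">
</cfif>

<!--- Regenerate the file if it is no longer in the VFS --->
<cfif NOT FileExists(vfsFile)>
    <cffile action="write" file="#vfsFile#" output="This is a test">
</cfif>

<!--- Only read once we know the file is there --->
<cffile action="read" file="#vfsFile#" variable="content">
<cfoutput>#content#</cfoutput>
```

The idea is to treat anything in the VFS as disposable: if it's gone, rebuild it from the original source rather than assuming it survived.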

If you want to see what files are currently stored in the VFS on your server, you can use the cfdirectory tag like so:

<cfdirectory action="list" directory="ram:///" name="myRamFiles" recurse="true">

<cfdump var="#myRamFiles#">

To delete files, you can either use the cffile tag with action="delete" or you can use the cfdirectory tag to wipe out an entire directory worth of files at one time.
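As a rough sketch, both approaches might look like this. The paths are the illustrative ones from the earlier example, and I'm assuming the recurse attribute behaves the same way for ram:// directories as it does for directories on disk:

```cfml
<!--- Delete a single file, but only if it is actually there --->
<cfif FileExists("ram://myDir/mySubDir/foo.txt")>
    <cffile action="delete" file="ram://myDir/mySubDir/foo.txt">
</cfif>

<!--- Delete a directory along with everything underneath it
      (assumes recurse works for the VFS as it does on disk) --->
<cfif DirectoryExists("ram://myDir")>
    <cfdirectory action="delete" directory="ram://myDir" recurse="true">
</cfif>
```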

I hope this helps clear up some questions that aren't directly answered in the current beta documentation. If you have other questions about the new VFS, let me know.

New in ColdFusion9: Virtual File System

One of the great new features in ColdFusion 9 that I haven't seen much press about is its Virtual File System. The virtual file system is essentially a RAM disk (remember back to DOS?). This allows you to do three really cool things. First, you can now write files such as images, spreadsheets, etc. to memory instead of disk before serving them back to the browser. Here's an example from the beta docs that shows this in use for writing a JPG file to memory and serving it up:

<cffile action="readBinary" variable="myImage" file="#ExpandPath('./')#/blue.jpg">
<cffile action="write" output="#myImage#" file="ram://a.jpg">
<cfif FileExists("ram://a.jpg")>
    <cfoutput>a.jpg exists</cfoutput>
<cfelse>
    <cfoutput>a.jpg doesn't exist</cfoutput>
</cfif>

The second thing this lets you do is write dynamic .cfm files to memory and execute them. Again from the beta docs, to write a file you would do something like this:

<cffile action="write" output="#cfml#" file="ram://filename.cfm"/>

How you use/execute an in-memory file depends on whether the tag/function you are using requires a relative or absolute path. For tags/functions that require a relative path, you need to first create a mapping for ram:// in the ColdFusion Administrator. Once you've done that, you simply use the mapping in the relative path. For example, if you create a mapping called /inmemory, you would use it within cfinclude like this:

<cfinclude template="/inmemory/filename.cfm">

For tags/functions that take an absolute path, the syntax is straightforward. From the beta docs:

<cffile action="append" file="ram://a/b/dynamic.cfm" output="I'm appending">

The third thing you can do with the virtual file system is write and execute CFCs in memory. To write a CFC to the virtual file system you do the following, from the beta docs:

<cffile action="write" output="#cfcData#" file="ram://filename.cfc"/>

You execute the CFC like so:

<cfset cfc=CreateObject("component","inmemory.filename")/>

There are some limitations to the RAM-based file system. First and foremost, you can't write Application.cfm or Application.cfc to memory. Additionally, paths are case-sensitive.

The full list of tags that support the virtual file system is as follows:

  • cfcontent
  • cfdocument
  • cfdump
  • cfexchange
  • cfexecute
  • cffeed
  • cfhttp
  • cfftp
  • cfimage
  • cfloop
  • cfpresentation
  • cfprint
  • cfreport
  • cfzip

Supported file functions:

  • FileIsEOF
  • FileReadBinary
  • FileMove
  • FileCopy
  • FileReadLine
  • FileExists
  • FileOpen
  • FileWriteLine
  • FileClose
  • FileRead
  • FileDelete
  • DirectoryExists
  • FileSetLastModified

So, what do you all think? I think this opens up a lot of interesting possibilities, especially in terms of performance improvement.

Adobe ColdFusion 9 and ColdFusion Builder Hit Public Beta

Lots and lots of hard work by the ColdFusion Engineering Team has resulted in today's announcement that ColdFusion 9 (formerly code-named Centaur) and the new ColdFusion Builder IDE have hit the public beta milestone and are now available on Adobe Labs.

Also tucked in with the release is the availability of the new ColdFusion Public Bugbase.

If you haven't had a chance to download the public betas, you should really give them a try. ColdFusion 9 focuses heavily on developer productivity enhancements. I'm not going to list everything out here, but let me just say that lots and lots of things people have been clamoring for are included!

Fix for Vista's: The User Profile Service service failed the logon. User profile cannot be loaded

Back in November I finally took the plunge and upgraded my home computer to the 64-bit version of Windows Vista Ultimate via a clean install. One thing I did immediately after the install was to move the \users directory from c:\users to f:\users. I did this for two reasons. First, the drive Vista was installed on was only 250GB and I could see running out of room pretty quickly given all of the documents, pictures, videos, etc. I had on the system. The second reason for the move is that I wanted to separate my data from the operating system as much as possible to make upgrades and backups easier to manage.

Unfortunately, there's no easy way in Vista to relocate the \users directory. If you know what you're doing you can change the location during install by using an unattended install, but this can be very complicated to do and is beyond most casual users. In the end I settled on moving all of c:\users over to f:\users and using symbolic links to point from c:\users to f:\users. That way programs could continue to reference c:\users but the operating system would be smart enough to forward all requests to f:\users. Following the directions here I was able to move the directories and files and create the required symbolic links. Everything worked well until I got back from vacation last week and my wife tried to log in to her account to pay some bills and was greeted by the following error: "The User Profile Service service failed the logon. User profile cannot be loaded." This seemed odd because she had successfully logged into her account only a few weeks ago.

Searching the web for answers turned up this site, which nearly everyone else experiencing the problem linked to.

My problem boiled down to this. I could log in to Vista using my (Admin) account and create as many new users as I wanted to via the User Management tools in the control panel. In the User Management tool, I could see each and every one of the new accounts. When I booted up the system or chose to Switch Users, all of the newly created accounts showed up on the log in screen. However, any attempt to log in using any of those accounts resulted in the same "The User Profile Service service failed the logon. User profile cannot be loaded" error.

The recommended solution involved making changes to a specific registry entry that had become corrupted and contained a backup entry. After looking through the recommended solutions, it was obvious to me that my problem was a little different from that of the majority of users posting to the site. In my case, there was no corrupt registry entry and no backup key to work with. In fact, there were no registry entries for any user accounts other than my working Admin account. I also didn't have a system restore point that went back far enough to predate the point where I believed the problem had started. From what I could tell, the problem started after an automated Windows update had been applied. The recommendation made to me and others on the forums with the same problem was to reinstall Vista, something I wasn't keen on doing.

At this point, it seemed to me that something must be wrong with the initial creation of a user's profile the first time they log on to Vista. When you create a new user account from the User Manager, Vista doesn't actually create the user's directories until their first log in. When a user logs in for the first time, Vista uses the contents of c:\users\default as a template for the directory/file structure for that user. In the case of the "The User Profile Service service failed the logon. User profile cannot be loaded" error I was getting, the new user directory (and associated registry entry) was never getting created.

A little more digging through the various Windows log files turned up something interesting. In addition to all of the errors stemming from the user not being able to log in successfully, there was a warning that a particular filename/extension was too long to be copied. It turns out that Vista ran into a problem while trying to copy the default profile during the account creation/log in process. Specifically, there are two directories preventing the default profile from being created. The first is:

c:\users\default\AppData\Local\Application Data

As you can see in the following screen shot, the root Application Data folder contains a lot of recursively added \Application Data folders. My best guess is that something went wrong during one of the Windows update processes, resulting in all of the extra recursive \Application Data directories. From the research I've done this doesn't appear to be limited to a single specific Windows update as people have reported the problem as far back as 2007.

Screen1

The second directory you'll need to take a look at is:

C:\users\default\Local Settings\Application Data

Again, if you look in this directory you should find several more levels of \Application Data appended to the top level \Application Data:

Screen2

In both cases, what you'll need to do is delete all of the additional occurrences of \Application Data below the root level. Once you've done this, any user experiencing the "The User Profile Service service failed the logon. User profile cannot be loaded" error should be able to log in.

Adobe Releases Photoshop Lightroom 2.1

Looks like the Lightroom 2.1 update has officially been released. For those waiting for it, the 2.1 release includes Adobe Camera Raw 5.1 as well as a long list of bug fixes. For the full list, see the ReadMe (PDF).

Windows Version

Mac Version