Friday, July 06, 2007

.NET Data Caching

Part 1
What is Data Caching?
In simple terms, data caching is storing data in memory for quick access. Typically, information that is costly to obtain (in terms of performance) is stored in the cache. One of the more common items stored in a cache in a Web application environment is commonly displayed database values; by caching such information, rather than relying on repeated database calls, the demand on the Web server's and database server's system resources is decreased and the Web application's scalability is increased. As Microsoft eloquently puts it, "Caching is a technique widely used in computing to increase performance by keeping frequently accessed or expensive data in memory. In the context of a Web application, caching is used to retain pages or data across HTTP requests and reuse them without the expense of recreating them."
In classic ASP we didn't have anything nearly as sophisticated nor as powerful as the ASP.NET caching API that is now available to us. With .NET we have the ability to cache whole pages (output caching), parts of pages or server controls (fragment caching) and data caching with the lower-level Cache API.
In this article I will be examining data caching in detail. For more information on output caching see Page Output Caching from the ASP.NET QuickStarts. For more information on fragment caching, see Page Fragment Caching from the ASP.NET QuickStarts.
Applications, Sessions and Cookies
While classic ASP does not have the rich data caching API found in the .NET Framework, it does give us the ability to maintain state and to cache data with the help of session, application and even cookie objects. For starters, cookies, which are stored on the Web visitor's computer (on disk), don't hold a lot of data. They have a 4 KB limit and can contain only string information. Additionally, a user may have their browser configured not to accept cookies. For more information on using cookies in classic ASP, see the Cookies FAQs on
Session variables can also be used to cache information in classic ASP, although, as with the cookie approach, each session variable is specific to a particular user, and to tie a session variable to a particular user the user's browser must accept cookies. The advantage of using session variables over cookies is that you can store objects, such as an array or a Dictionary object. Since session variables are stored in the Web server's memory, storing large objects in a user's session on a site with many simultaneous users can leave the Web server short of memory. For more information on session variables see the Session Variables FAQs on
Most often, application variables were the means one would use to cache information in classic ASP. Since a given application variable is "global" to the entire Web application, application variables are primarily candidates for caching information that is shared across all Web pages on the site. For example, imagine that you ran an eCommerce Web site and that on every page you wanted to list the top 10 selling products. Rather than do a database access on every page, you could cache the results in an application variable. This is an excellent example of when an application variable is a good choice for caching. If you need to cache more user-specific information, such as the 10 most recently purchased items for the user who's visiting the site, you'd likely want to employ the session object or cookies to do this. (For more information on caching database values in an application variable be sure to read: A Real-World Example of Caching Data in the Application Object.)
You can use application variables for caching in ASP.NET much like you did in classic ASP. However, since ASP.NET Web pages can utilize the .NET data cache APIs, there's really no reason to ever resort to using application variables for caching in an ASP.NET Web application. In fact, caching data through the data cache API as opposed to through application variables in an ASP.NET Web application has its advantages, including: items in the data cache will be evicted from memory if memory becomes scarce; when adding items to the data cache you can specify how long they should persist in the cache in terms of an absolute or sliding time; and many other advantages, which we will examine in this article.
So to sum up the use of Application, Session and/or Cookie objects for caching in a classic ASP page:
  1. Use cookies for small non-critical data.
  2. Use Sessions for per-user data, as in an eCommerce site.
  3. And use Application variables for site-wide information that doesn't require constant revision.
If you're interested in data caching in classic ASP, you can find more on that at Caching Data and Learn More About Caching. But we're here to show the incredible power .NET data caching has. While this article will focus on .NET data caching in detail, a good general article on .NET caching basics can be found here at Caching with ASP.NET.
Using Data Caching
The .NET data caching API is built around two classes in the System.Web.Caching namespace. The first class, Cache, is the class we'll be using to add and remove items from the data cache. The second class, CacheDependency, is used when assigning a cache dependency to an item in the data cache (we'll be discussing this in due time).
To add an item to the cache you can simply do:
Cache("key") = value

// In C#
Cache["key"] = value;
Note that the C# version uses brackets instead of parentheses when referencing the items of the cache in the above manner. In the remainder of the examples in this article I will be using VB.NET syntax, but will point out any major differences between the VB.NET and C# syntax.
The above code adds the item value to the data cache with the key key. The key is used to reference the item at some later point. That is, in another ASP.NET Web page we can extract the value inserted above by using:
value = Cache("key")

- or -

value = Cache.Get("key")
To explicitly remove an item from the data cache you can use the Remove method, specifying the key of the cache item you want removed:
Cache.Remove("key")
Now that we've examined the simple form for adding, removing, and selecting an item from the data cache, let's take a more in-depth look at adding items to the cache. In Part 2 we'll examine the Cache.Insert method in detail, which can be used to enter cache items with cache dependencies, absolute and sliding time expirations, eviction priority, and callback delegates.
Part 2
In Part 1 we looked at how to add an element to the cache, retrieve an element, and remove an element. In this part we'll look at the Cache.Insert method, which allows for more powerful semantics with inserting an item into the data cache.
Inserting an Item into the Cache
As we saw earlier, inserting an item into the data cache is as simple as saying Cache("key") = value. Items can also be inserted into the cache using the Insert method, which allows for more powerful semantics on how the item in the cache should be handled. Specifically, the Insert method has four overloaded forms, as shown below:
  1. Insert(key as String, value as Object) - Inserts the object value into the cache, giving the item the key name key. This is semantically equivalent to using Cache("key") = value.
  2. Insert(key as String, value as Object, dependencies as CacheDependency) - Inserts the object value into the cache with key name key and dependencies specified by the dependencies parameter. We'll discuss cache dependencies shortly.
  3. Insert(key as String, value as Object, dependencies as CacheDependency, absoluteExpiration as DateTime, slidingExpiration as TimeSpan) - Inserts the object value into the cache with key name key, dependencies dependencies, and (time-based) expiration policies. Expiration policies, as we'll discuss soon, specify when the item should be evicted from the cache.
  4. Insert(key as String, value as Object, dependencies as CacheDependency, absoluteExpiration as DateTime, slidingExpiration as TimeSpan, priority as CacheItemPriority, onRemoveCallBack as CacheItemRemovedCallback) - Inserts the object value into the cache with key name key, dependencies dependencies, (time-based) expiration policies, a cache priority, and a callback delegate. The priority specifies how important it is for the cache item to remain in the cache. That is, items with a lower priority will be evicted from the cache before items with a higher priority. The callback delegate provides a means for you to create your own function that is automatically called when the item is evicted from the cache.
The above list of the various forms of the Insert method may look quite daunting. Most often you'll likely use either form 1, 2, or 3. Let's take a moment to discuss the CacheDependency and absolute and sliding time parameters.
Recall that information stored in the data cache is being stored in the Web server's memory. In a perfect world, when an item is added to the cache using Insert(key, value) or Cache("key") = value, the item will remain in the cache forever. Unfortunately this is not plausible in the real world. If the computer the Web server runs on is rebooted, or shuts off, for example, the cache will be lost. Even if the Web server machine is running, you may lose items from the cache.
To see why, imagine that your Web server has allocated 1 MB of memory for storing items in the data cache. Now, imagine that you've added a number of items to the data cache such that you've used exactly 1 MB of memory. Great. Now, what happens when you add another item to the data cache? In order for the new item to "fit," the data cache needs to make room for it by removing an existing item. The existing item that is chosen to be removed is said to be evicted.
There may be times when you don't want an item to exist in the cache indefinitely. For example, say that you were displaying an XML file in an ASP.NET DataGrid. (See XML, the DataSet, and a DataGrid for an article illustrating how to accomplish this!) Rather than load the XML file into a DataSet and bind the DataSet to the DataGrid each page view, you may opt to cache the DataSet in the data cache. This option would work great until the XML file was altered; at that point, if you were still displaying the cached DataSet the user would be seeing stale information.
To overcome this problem, you can add the DataSet to the data cache, but specify that the XML file it represents is a cache dependency. By setting this file as a cache dependency, when the file changes the DataSet will be automatically evicted from the cache. That means the next time the page attempts to read the DataSet from the cache, it will not be found (since it has been evicted) and will have to be recreated by repopulating the DataSet from the XML file. This is the desired behavior, since the XML file has changed since the DataSet was last cached. In order to insert an item with a cache dependency, you can do:
Cache.Insert("key", myDataSet, New CacheDependency(Server.MapPath("data.xml")))
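The check-then-repopulate pattern described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the article's own code: the class name XmlDataCache, the method GetDataSet, and the cache key "XmlData" are hypothetical, and it uses HttpRuntime.Cache so that the helper can live outside a page class.

```vbnet
Imports System.Data
Imports System.Web
Imports System.Web.Caching

' Hypothetical helper; the class name, method name and "XmlData" key are assumptions.
Public Class XmlDataCache

    ' physicalPath is the full path to the XML file,
    ' for example the result of Server.MapPath("data.xml").
    Public Shared Function GetDataSet(ByVal physicalPath As String) As DataSet
        ' Try the cache first; this yields Nothing if the item was never
        ' added, has expired, or was evicted because the file changed.
        Dim ds As DataSet = CType(HttpRuntime.Cache("XmlData"), DataSet)

        If ds Is Nothing Then
            ' Cache miss: reload from the XML file...
            ds = New DataSet()
            ds.ReadXml(physicalPath)

            ' ...and cache it again with the file as its dependency.
            HttpRuntime.Cache.Insert("XmlData", ds, New CacheDependency(physicalPath))
        End If

        Return ds
    End Function

End Class
```

A page could then bind its DataGrid to XmlDataCache.GetDataSet(Server.MapPath("data.xml")) on every request and only pay for the file read after the file has changed.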
If you wish to have the cache item evicted from the cache in an absolute time, say, five minutes from when it was inserted into the cache, you can use the third form of the Insert method, whose fourth parameter expects a DateTime value specifying the absolute time. The following code illustrates how to add an item to the cache that will expire five minutes from when it was added and has no cache dependencies:
Cache.Insert("key", value, Nothing, DateTime.Now.AddMinutes(5), TimeSpan.Zero)
In C# you would use null instead of Nothing to signify that you do not want a cache dependency.
Note that since we do not want to specify a sliding time expiration, we set the last parameter to TimeSpan.Zero. Whereas an absolute time specifies that the item should be evicted from the cache at a specific time, the sliding time eviction parameter specifies that the cache item should be evicted if it is not referenced in a certain timespan. That is, if we set the timespan parameter to, say, TimeSpan.FromSeconds(30), the cache item will be evicted if it is not referenced within 30 seconds. If it is referenced within 30 seconds, it will be evicted if it's not referenced in another 30 seconds from when it was last referenced, and so on. An example of this would be:
Cache.Insert("key", value, Nothing, Cache.NoAbsoluteExpiration, TimeSpan.FromSeconds(30))
Note that when using the sliding time expiration parameter you should not also supply an absolute expiration; pass Cache.NoAbsoluteExpiration as the fourth parameter instead (current versions of the framework throw an ArgumentException if both an absolute and a sliding expiration are set). The cache determines when the item should expire by adding the sliding time to the current time. Of course, if the item is referenced within that time period, the calculation is redone and the expiration time is reset.
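To make the two expiration policies easier to compare side by side, here is a small sketch. The class ExpirationExamples and its two method names are hypothetical; Cache.NoSlidingExpiration and Cache.NoAbsoluteExpiration are the framework's own constants, and NoSlidingExpiration has the same value as the TimeSpan.Zero used earlier.

```vbnet
Imports System
Imports System.Web
Imports System.Web.Caching

' Hypothetical helper class; the names are assumptions for illustration.
Public Class ExpirationExamples

    ' Absolute expiration: the item is evicted the given number of minutes
    ' after it is inserted, no matter how often it is read.
    Public Shared Sub InsertAbsolute(ByVal key As String, ByVal value As Object, _
                                     ByVal minutes As Double)
        HttpRuntime.Cache.Insert(key, value, Nothing, _
            DateTime.Now.AddMinutes(minutes), Cache.NoSlidingExpiration)
    End Sub

    ' Sliding expiration: the item is evicted once it has gone unread for
    ' the given number of seconds; each read restarts the clock.
    Public Shared Sub InsertSliding(ByVal key As String, ByVal value As Object, _
                                    ByVal seconds As Double)
        HttpRuntime.Cache.Insert(key, value, Nothing, _
            Cache.NoAbsoluteExpiration, TimeSpan.FromSeconds(seconds))
    End Sub

End Class
```

Absolute expiration suits data that goes stale on a schedule, while sliding expiration suits data that is only worth keeping in memory as long as someone keeps asking for it.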
Before we move on to some examples, let's take a quick look at the CacheItemRemovedCallback delegate. Recall that you can set this in the fourth overloaded form of the Insert method. The CacheItemRemovedCallback specifies a function that is called when the item has been evicted from the cache. To use the CacheItemRemovedCallback you need to first create a function that has the definition:
Sub CallbackFunction(String, Object, CacheItemRemovedReason)
The CacheItemRemovedReason is an enumeration that explains why the item was removed from the cache. Its entries include:
  1. DependencyChanged - the item was removed because its cache dependency was changed.
  2. Expired - the item was removed because it expired (either by absolute or sliding time expiration).
  3. Removed - the item was explicitly removed via the Remove method.
  4. Underused - the item was evicted by the cache because the system needed to free up memory.
To add a CacheItemRemovedCallback function to an added cache item you will need to create the appropriate function and a delegate variable that is wired up to the function, as shown below:
'You would use onCacheRemove as the input parameter for the
'CacheItemRemovedCallback parameter in the fourth form of the Insert method
Dim onCacheRemove As CacheItemRemovedCallback
onCacheRemove = New CacheItemRemovedCallback(AddressOf Me.CheckCallback)

' Now, create the function
Sub CheckCallback (str As String, obj As Object, reason As CacheItemRemovedReason)
  Response.Write ("Cache was removed because : " & reason.ToString())
End Sub
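For completeness, here is a sketch of the delegate being passed to the fourth form of Insert. The class CacheCallbackDemo and its method names are hypothetical. Because the callback can fire when no page request is in progress, this sketch writes to the trace log rather than to Response; that is a design choice of the sketch, not something the article prescribes.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Web
Imports System.Web.Caching

' Hypothetical helper class; the names are assumptions for illustration.
Public Class CacheCallbackDemo

    ' Inserts value under key for five minutes and registers a callback
    ' that runs when the item leaves the cache for any reason.
    Public Shared Sub InsertWithCallback(ByVal key As String, ByVal value As Object)
        Dim onRemove As New CacheItemRemovedCallback(AddressOf OnItemRemoved)

        HttpRuntime.Cache.Insert(key, value, Nothing, _
            DateTime.Now.AddMinutes(5), Cache.NoSlidingExpiration, _
            CacheItemPriority.Normal, onRemove)
    End Sub

    ' Matches the CacheItemRemovedCallback signature: key, value, reason.
    Private Shared Sub OnItemRemoved(ByVal key As String, ByVal value As Object, _
                                     ByVal reason As CacheItemRemovedReason)
        Trace.WriteLine("Cache item '" & key & "' was removed because: " & reason.ToString())
    End Sub

End Class
```

The reason argument will be one of the four enumeration values listed above, so the callback could, for example, reload the item only when the reason is Expired or DependencyChanged.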
I am not going to go into detail discussing the CacheItemPriority option, since I've found very few real-world uses for this parameter (in designing Web applications). To learn more about this enumeration and how to use it with the Insert method, see the technical documentation.
At this point we've discussed the basics of data caching and the specifics of the Cache.Insert method. In Part 3 we'll examine a real-world caching example: a pagable DataGrid that uses a cached DataSet to save on database hits.
Part 3
In Part 2 we looked at the Cache.Insert method extensively. In this third and final part we'll examine a real-world application that utilizes data caching to provide increased performance for a pagable DataGrid.
A Real-World Example
Over a year ago I had written another article here on 4Guys discussing a method of paging a .NET datagrid with exact count - Custom ASP.NET Datagrid Paging With Exact Count. I went on to show how you can cleverly page a datagrid and always show how many rows are coming up, like "Next 5 >" and right before the last page you'll see "Next 1 >", for instance, assuming there would be just one record to be displayed on the last page.
Now this was cool, and the emails I received attested that people enjoyed it and found it useful. Nevertheless, as Scott Mitchell pointed out, the DataGrid by default has one particular flaw: when displaying a result set with, say, 5,000 records, each paging action hits the database again and pulls in all 5,000 records again! This is an obvious performance issue that affects the application and can greatly diminish scalability.
There are other means to remedy this. One way is creating a stored procedure that returns only the pertinent records, as discussed at: Paging through Records using a Stored Procedure. I have used this method on occasion prior to utilizing .NET caching. It works well, but the amount of code you need to write is quite substantial. Additionally, even with the stored procedure approach the database must be hit each time the user pages through the data.
By caching the result set in the data cache you can allow the user to page through the cached data, thereby not enduring any database hits except for the first time the item is loaded into the cache (and any other time it gets evicted and needs to be reinserted into the cache). The only potential downside is that the cached data may become stale over time as the underlying database data changes. With some clever programming, though, you can set up your database inserts, updates, and deletes such that when new data is added to the underlying database the cached data is invalidated. See Invalidating an ASP.NET Web Application Cache Item from SQL Server for more details.
The caching for my example resides solely in the BindMyDataGrid() method, which is responsible for binding the data to the DataGrid. The first line of code in this subroutine grabs the cached DataSet from the data cache, as can be seen below:
Sub BindMyDataGrid()
'Programmatic Caching Setup
Dim DataGridCache As DataSet = CType(Cache.Get("DataGridCache"),DataSet)

Of course the DataSet might not exist in the cache. We might not have added it, or it may have expired or been evicted. Hence, before doing anything else we must check to see if DataGridCache is equal to Nothing. If it is, then we must populate our DataSet from the database and store it back in the cache. If it is not, then we can just proceed to the code that binds the DataSet to the DataGrid.
  'Continued from previous code block...

If DataGridCache is Nothing Then
'Populate the DataSet with data from the database
Const CommandText As String = _

'The connection to our database
Dim myConnection as New _

Dim myCommand As New SqlDataAdapter(CommandText, myConnection)
Dim DS As New DataSet()

'Fill the DataSet using the data adapter
myCommand.Fill(DS)
'Specify the DataSource for the DataGrid is the DataSet
'we just populated
MyDataGrid.DataSource = DS

'Now insert dataset into cache, specifying that it should
'expire in 10 minutes
Cache.Insert ("DataGridCache", DS, Nothing, _
DateTime.Now.AddMinutes(10), TimeSpan.Zero)
lblCacheInfo.text = "DataGrid was populated from the database..."

'Specify what time the cache was updated
Application("TimeCachedDataSetAdded") = DateTime.Now

'Determine how many total records we have
RcdCount = CInt(DS.Tables(0).Rows.Count.ToString())
Else
'The DataSet is in the cache.
lblCacheInfo.text = "DataGrid was used from cache. The cache " & _
"was populated at " & _
Application("TimeCachedDataSetAdded").ToString()

'Populate datagrid from cache.
MyDataGrid.DataSource = DataGridCache

'Calculate the total # of records
RcdCount = CInt(DataGridCache.Tables(0).Rows.Count.ToString())
End If

'Bind the datagrid from either source
MyDataGrid.DataBind()
ShowStats() 'Displays what page we're on and such
End Sub
Other than the BindMyDataGrid() method shown above the only other important part of the ASP.NET Web page is the HTML section, which sets certain DataGrid properties to allow for paging.
<asp:Label id="lblCacheInfo" runat="server" />


<form runat="server">
<asp:datagrid id="MyDataGrid" runat="server" Font-Bold="True"
AllowPaging="True" PageSize="10"
[View a Live Demo!]
The more important properties of the DataGrid Web control have been bolded. Be sure to check out the live demo to see how the pieces fit together. For more information on the DataGrid be sure to read: An Extensive Examination of the DataGrid Web Control.
In this article we examined the awesome power and useful features of the .NET caching API. Specifically, in this article we looked at a real-world example where the data for a pagable DataGrid was cached in the data cache, saving round trips to the database each time the user steps through a page of data.
Until next time, happy programming!
