Optimizing Plone performance with caching
NOTE: This document is something of a brain-dump. I’m still experimenting with various caching techniques, and would like to organize it somewhat better, so it’s currently a work in progress.
I’ve long been a big fan of the Plone CMS, which powers this very Web site. Any ambivalent feelings I may have had about it have generally derived from its sometimes brutal tendency to hog resources. Throw in some Zope- and ZWiki-related memory leaks while you’re trying to run this thing on a machine with only 400 MB of RAM and you’re going to run into problems (this very scenario was the source of major headaches for me over the last year).
But Plone is a powerful, elegantly-designed, object-oriented system. These attributes don’t come for free–they can require heavy-duty hardware resources. When you don’t necessarily have them, the trick is to find a workaround.
In this case, that workaround is server-side caching. That’s something I’ve known for a long time, but for whatever reason, its implementation remained more or less a mystery to me despite having read up on it in a few different sources, The Definitive Guide to Plone among them. It wasn’t until I created this site, refamiliarized myself with Plone, and began to reinvestigate the subject that it finally clicked, in a way that made me feel silly for not having solved this problem in short order on the sites on that poor, memory-challenged server. Revisiting the Definitive Guide and reading through a tutorial on plone.org got me to where I wanted to be. One of the fundamental realizations I made was that the only thing the Plone HTTP Cache Manager really does is apply cache-related headers to server responses so that Apache, if in fact it’s being used, can do something with them. It doesn’t do any caching itself at all, unlike the RAM Cache Manager (more on that below).
So here’s what I did:
I’m running Plone 2.1, fronted by Apache 2.0 on the server. I knew that what I needed to do to cut back on unnecessary performance hits was to keep requests from ever getting to the Zope server to begin with–i.e., those requests would need to grab a copy of the resource from a cache. For most sites, this one included, anonymous users of the site don’t really need to see brand spanking-new content more than once per hour, so the idea would be that each resource on a site is cached for up to an hour: the first user to grab a page at, say, 5:45 PM will get it fresh from the Zope server, while any subsequent user will get a static copy of that content from the caching server, rather than from the Zope DB, until 6:45 PM, at which point it may have changed, and a new copy will be retrieved from Zope upon the next request for it.
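To make the one-hour idea concrete, here is a minimal sketch of what the caching layer does, written as a toy Python class. It is not how Apache implements anything; the names TTLCache, fetch and clock are made up for illustration, and the clock is injectable only so the behavior can be checked without waiting an hour.

```python
import time


class TTLCache:
    """Toy stand-in for a caching proxy: keeps each response for ttl seconds."""

    def __init__(self, fetch, ttl=3600, clock=time.time):
        self.fetch = fetch    # hypothetical function that asks the backend (Zope) for a path
        self.ttl = ttl        # lifetime of a cached copy, in seconds
        self.clock = clock    # injectable so the example can use a fake clock
        self.store = {}       # path -> (expires_at, body)

    def get(self, path):
        now = self.clock()
        entry = self.store.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]   # still fresh: the backend is never contacted
        body = self.fetch(path)                    # first request, or copy expired
        self.store[path] = (now + self.ttl, body)
        return body
```

With ttl=3600, a page first requested at 5:45 PM is served from the store until 6:45 PM, and the next request after that goes back to the backend.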
(Incidentally, here’s another consideration to keep in mind: while a listing of recent content might change a few times per day on this site, an individual document will almost never change. If you have the ability to distinguish between these types of pages for caching purposes, it may be worthwhile to do so–i.e., you could expire the content listing every hour, but expire an individual content page once per day.)
Anyway, I think part of my previous problem was my assumption that any kind of caching would necessarily involve learning yet another major server-side application, the Squid proxy server. But as it turns out, both Apache 1.3 and Apache 2.0 include proxy caching capabilities. With mod_cache and its storage modules mod_disk_cache and mod_mem_cache, Apache 2.0 can manage cached items either on disk or in memory. Because memory seems to be the resource that’s in short supply in most server environments I find myself in, I stick with mod_disk_cache.
To enable caching on the Apache side, what I ended up with was an Apache entry that looks something like this:
<VirtualHost *>
    ServerName www.mysite.com
    RewriteEngine On
    RewriteRule ^/(.*) http://localhost:8080/VirtualHostBase/http/mysite.com:80/mdorn/VirtualHostRoot/$1 [NC,P,L]
    CacheRoot "/var/cache/apache2/proxy"
    CacheEnable disk /
    CacheSize 10000
    CacheDefaultExpire 1
</VirtualHost>
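Essentially I’m identifying and enabling a cache directory that will accommodate roughly 10 MB of items (CacheSize is given in kilobytes). I originally believed that CacheDefaultExpire 1 applies a one-hour expiration policy to any item that has been signaled for caching without a specified time period, which is how the directive worked in Apache 1.3’s mod_proxy. As far as I can tell from the Apache 2.0 mod_cache docs, though, the value there is in seconds, so 3600 is what you’d want for one hour. That’s probably irrelevant for Plone caching anyway, since the HTTP Cache Manager sends its own expiration headers. There are several other related directives that I don’t include here because, after reading up a bit in the Apache docs, the default values seem fine for my purposes.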
Once you’ve got Apache running with these new directives, go into the default HTTP Cache Manager in your Plone site. For anything you want to cache that lives in the ZODB (e.g., in the "custom" directory of portal_skins), you’ll need to undertake the tedious process of associating those items with the cache object in the "Associate" tab. But if the items that you want to cache are filesystem objects (and ideally they will be), you’ll need to create a ".metadata" file associating them with your HTTPCache object. For example, if you have a page template called "my_template.zpt", you’d create a file in the same directory called "my_template.zpt.metadata" with the following contents:
[default]
title = My Page Template
cache = HTTPCache
You can also "associate" filesystem objects in the ZMI, but those settings are lost when Zope is restarted, which is why the ".metadata" file is the way to go.
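If you have a lot of filesystem templates, writing these files by hand gets old quickly. Here is a minimal sketch of a helper that does it in bulk. The function name write_cache_metadata, the choice of extensions, and the use of the file name as the title are all my own assumptions, not anything Plone provides.

```python
import os


def write_cache_metadata(skin_dir, cache_name="HTTPCache", extensions=(".pt", ".zpt")):
    """Create a .metadata file next to each template in skin_dir.

    Existing .metadata files are left alone. Returns the paths that were created.
    """
    created = []
    for name in sorted(os.listdir(skin_dir)):
        if not name.endswith(extensions):
            continue  # skips non-templates, including existing *.metadata files
        meta_path = os.path.join(skin_dir, name + ".metadata")
        if os.path.exists(meta_path):
            continue  # never overwrite settings someone already wrote
        with open(meta_path, "w") as f:
            # Same layout as the example above; the title is just the file name.
            f.write("[default]\ntitle = %s\ncache = %s\n" % (name, cache_name))
        created.append(meta_path)
    return created
```

Calling write_cache_metadata on a skin directory containing my_template.zpt would create my_template.zpt.metadata alongside it. Run it against a copy of the directory first if you want to see what it produces.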
It’s also worth noting here that there are probably some objects that you want to cache always. These would probably include images, JavaScript and CSS. By default, many stock Plone objects of these types have metadata files to do just that, but it may make sense simply to add the following directives to your Apache entry to avoid having to manage this in Plone at all:
ExpiresActive On
ExpiresByType image/gif A3600
ExpiresByType image/png A3600
ExpiresByType image/jpeg A3600
ExpiresByType text/css A3600
ExpiresByType text/javascript A3600
ExpiresByType application/x-javascript A3600
By default, HTTPCache is set to expire all your cacheable items 3600 seconds (i.e., one hour) after first access. As I mention above, it would be nice to be able to distinguish between content types and vary that number accordingly. By creating a caching policy (as per page 452 of the Definitive Guide), you can apparently do exactly that, as well as make other kinds of distinctions using any Python expression, but I think I will start with a global expiration time of one hour for any content that I happen to be caching. (Note: I couldn’t get this to work properly by following the Definitive Guide’s example anyway.)
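Just to illustrate what varying the expiration by type of page would amount to, here is a minimal sketch in plain Python. It does not use Plone’s caching policy machinery at all; the kinds "listing" and "document" and the function cache_headers are hypothetical, and I’m assuming the headers of interest are Cache-Control and Expires.

```python
from email.utils import formatdate

# Hypothetical lifetimes: listings expire hourly, individual documents daily.
MAX_AGE_BY_KIND = {"listing": 3600, "document": 86400}


def cache_headers(kind, now, default_max_age=3600):
    """Return cache-related response headers for a page of the given kind.

    now is a Unix timestamp; unknown kinds fall back to default_max_age.
    """
    max_age = MAX_AGE_BY_KIND.get(kind, default_max_age)
    return {
        "Cache-Control": "max-age=%d" % max_age,
        "Expires": formatdate(now + max_age, usegmt=True),  # HTTP-style date in GMT
    }
```

A caching proxy that honors these headers would then refetch a content listing after an hour but keep an individual document for a day.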
To test that things were working, before implementing caching, I ran ApacheBench (ab) like so:
/usr/sbin/ab -n 100 http://mattdorn/
which sent 100 requests to the server. Each one went to get the content from Zope, and the whole run took nearly a minute. Afterward, I associated the main template that’s used by my home page, which happens to be atct_topic_view, with the HTTP cache. With caching thus implemented, the same benchmark was nearly instantaneous after the first page access, as you’d probably expect, given that it’s just Apache grabbing some static files out of a cache directory at this point.
I also changed the interval to 30 seconds, and issued requests within and after that time period while tailing $INSTANCE/log/Z2.log to see whether requests were being received by Zope appropriately (i.e., only after the interval had expired). It, too, worked as expected.
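Eyeballing a tailed log works fine, but the same check can be scripted. The sketch below assumes Z2.log lines follow the usual Apache-style access log layout, with a bracketed timestamp followed by the quoted request; the function names and the sample path are made up for illustration.

```python
import re
from datetime import datetime

# Matches: [12/Mar/2006:17:45:01 -0500] "GET /some/path ...
LOG_LINE = re.compile(r'\[(?P<when>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)')


def zope_hits(lines, path_suffix):
    """Return the timestamps of requests whose path ends with path_suffix."""
    hits = []
    for line in lines:
        m = LOG_LINE.search(line)
        if m and m.group("path").endswith(path_suffix):
            hits.append(datetime.strptime(m.group("when"), "%d/%b/%Y:%H:%M:%S %z"))
    return hits


def min_gap_seconds(hits):
    """Smallest interval between consecutive hits, or None with fewer than two."""
    gaps = [(b - a).total_seconds() for a, b in zip(hits, hits[1:])]
    return min(gaps) if gaps else None
```

If caching is working with a 30-second interval, min_gap_seconds for the home page path should never come out below 30.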
So what if your site is a little more dynamic than this one, and you want to be able to, say, keep one of the portlets updated in real time, while you’re fine with the content in the main column only being updated every hour? Well, that requires a somewhat different strategy that involves using the RAM cache (which has nothing to do with HTTP headers at all), and probably some reprogramming of your site. Additionally, you won’t get nearly the performance advantages you realize by the global HTTP caching strategy. The Definitive Guide provides a very good example of RAM Cache use by pointing to the author’s site, ZopeZen.org. On the home page of that site, a Python script generates the Web log entries, with a tally of comments for each entry. That’s a quite expensive call, as it turns out. The site’s owner decides he can live with this listing only being updated every 30 minutes, and so he assigns the script that produces this information to the RAM cache, achieving a five-fold performance advantage. I considered trying this method by assigning my "folder_contents_custom" template, which controls the portion of the page that displays my Web log entries (i.e., it’s called by atct_topic_view), but as far as I can tell, only Python scripts and External Methods can be used with a RAM cache. I suppose I could reprogram my site to do all of the heavy lifting within a Python script rather than within the ZPT if I really needed this.
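The RAM cache idea, remembering the result of one expensive call while the rest of the page stays live, can be sketched in plain Python as a decorator. This is not Zope’s RAM Cache Manager, just the same idea in miniature; ram_cached and its clock parameter are names I made up.

```python
import functools
import time


def ram_cached(ttl, clock=time.time):
    """Cache a function's results in memory, per argument tuple, for ttl seconds."""
    def decorator(func):
        results = {}  # args -> (expires_at, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = results.get(args)
            if hit is not None and now < hit[0]:
                return hit[1]          # reuse the stored result
            value = func(*args)        # the expensive call
            results[args] = (now + ttl, value)
            return value
        return wrapper
    return decorator
```

Decorating only the function that builds the Web log listing with ram_cached(ttl=1800) would keep that listing for 30 minutes, while anything else on the page, such as a live portlet, is still computed on every request. That is also why the gain is smaller than with whole-page HTTP caching: every request still reaches the application.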
One final thing to note is that by default, Plone’s HTTP Cache object doesn’t send cache headers when you’re authenticated. That’s a good thing, because when you’re editing content, you implicitly need server responses to be fully dynamic. So content editing sessions will be the only time that your server is put under the standard stress that a Plone site is capable of inflicting.
In retrospect, my discoveries and conclusions here are consistent with a report that I myself commissioned, but in order for me to really understand what was going on, I had to have a real-life scenario to play with. Guess that’s not too surprising.
Unfortunately, the caching procedures described here have not been implemented on this site as of this writing, because I do not administer the box that it runs on. I’m sure I’ll get it done in short order.
Tags: mattdorn.com, plone, web development