
Coding in the Cloud – Rule 1 – Cache is Your Friend
// June 25th, 2009 // Development
Coding in the Cloud
By Adrian Otto
This post kicks off a series on rules for coding in the cloud. I’ve taken some notes – mainly from observing applications fail repeatedly. And what do I mean by failing? The trouble usually comes in one of three forms:
o Not scaling as traffic comes in.
o Site fails to function under high load, resulting in complete failure.
o Getting a huge, unexpected bill for overages.
These three nasty things can happen on a cloud if your site is not coded properly. I’ve thought about the things that lead to these tragedies and looked at some recommendations around what not to do and how to write web applications so you don’t end up in those situations.
So what is caching? Caching is saving some reusable entity that you plan on using over and over again.
So let’s say that you make a blog post. That blog post has a page, and when you link to that page it needs to run 50 database queries and generate a whole bunch of PHP/HTML applets, which then get displayed to the browser. The whole process might take three seconds. But if you only do that the first time, and then upon subsequent requests you show a cached version of that same output, you can cut those three seconds down to 10 milliseconds to read the cached file and output it to the user. That’s the most resource-saving practice that you can do on Cloud Sites—not repeating the same work over and over again.
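As a rough illustration of "do the work once, then serve the saved output", here is a minimal Python sketch. The function name `cached_page`, the `render` callback, and the hashed file naming are all assumptions made for the example; they are not taken from Cloud Sites or any particular CMS. For brevity it writes the file directly rather than through a temporary file.

```python
import hashlib
from pathlib import Path


def cached_page(cache_dir: Path, url: str, render) -> str:
    """Return the page for `url`, calling `render` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any request path maps to a safe file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    try:
        # Cache hit: one file read, no database queries, no rendering.
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        # Cache miss: do the expensive work once and keep the result.
        page = render(url)
        path.write_text(page, encoding="utf-8")
        return page
```

The first request for a URL pays the full rendering cost; every later request for the same URL is served from the file.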
Don’t store the cache in a database
Often people attempt caching, but instead of writing the cache to a file they write it to a database.
Saving cache in a database is a bad idea.
It’s trading one problem for another problem, saving one database query and creating another one. Caching in a database creates a scenario with lots of reads and lots of writes, and that comes with a high penalty, obviating the benefit of putting in a cache to begin with.
Use a memory-based or file-based cache
My recommendation is to use a file-based cache. For a good example of a file-based cache, look at WP Super Cache, a plugin for the WordPress CMS. It uses a file-based cache in coordination with mod_rewrite rules that allow you to bypass the PHP code entirely in order to serve pre-generated content. We see WordPress blogs getting a 10x improvement in resource utilization on their Cloud Sites deployment when they employ WP Super Cache by simply turning it on, without changing a single line of code. Every time somebody makes a blog post, it generates the new corresponding page and saves it in a file-based cache. All public visitors to the site see pre-generated HTML instead of having it generated on the fly.
Implement event-driven updates
Updates to the cache should be event-driven rather than updating the cache every 15 minutes or every hour. It’s generally a bad practice to automatically update the cache after a period of time. It’s a much better practice to update the cache when data changes, such as when someone comments on a blog post.
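One way to picture an event-driven update is to make the cache refresh part of the code path that changes the data. The sketch below is in Python and is only an illustration: `render_post`, `add_comment`, the in-memory `posts` dictionary, and the `post-<id>.html` naming are hypothetical stand-ins for a real application's rendering code and database.

```python
import html
from pathlib import Path


def render_post(post: dict) -> str:
    """Stand-in for the expensive page build (queries, templates)."""
    comments = "".join(f"<li>{html.escape(c)}</li>" for c in post["comments"])
    return f"<h1>{html.escape(post['title'])}</h1><ul>{comments}</ul>"


def add_comment(cache_dir: Path, posts: dict, post_id: int, comment: str) -> None:
    """Store the comment, then refresh the cached page as part of the same event."""
    posts[post_id]["comments"].append(comment)  # the data change
    page = render_post(posts[post_id])          # rebuild once, at write time
    (cache_dir / f"post-{post_id}.html").write_text(page, encoding="utf-8")
```

There is no timer here: the cached page is rebuilt only when a comment arrives, so a post that nobody comments on is never regenerated.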
Use a nonblocking design
When you implement a file-based cache, it’s tempting to use a blocking design where you lock the file in order to serialize reads and writes. The reason for using a blocking design is to make sure nobody reads the file while you’re writing it and gets partial output. But when you serialize access and you are making an update, all of your readers are serialized too, and this causes a huge backup of pending requests. It’s smarter to write your changes to a temporary file, which you then rename. If you’re serving cache.html and you want to update it, you open a new file and write it as cache.html.tmp. Then rename it to cache.html. Since renaming is an atomic operation, readers never see a partial file, and the update won’t interfere with active readers of the old file. You also don’t require any locking because you’re the only writer to that temp file. So by eliminating locking you eliminate all the serialization in file access, and nothing upstream is blocked. Recent versions of WP Super Cache work this way.
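The write-then-rename pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the WP Super Cache implementation: the function name `write_cache_atomically` is made up, and it uses a uniquely named temporary file (via `tempfile.mkstemp`) instead of a fixed cache.html.tmp so that two concurrent updaters cannot write to the same temp file.

```python
import os
import tempfile


def write_cache_atomically(path: str, content: str) -> None:
    """Replace the file at `path` without readers ever seeing a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one
    # filesystem. Note: mkstemp creates it with owner-only permissions; use
    # os.chmod if the web server reads the cache as a different user.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
        # The rename swaps the new file in as a single step; readers that
        # already opened the old file keep reading the old contents.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

No lock is taken anywhere, so readers are never made to wait on a writer.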
It is important to realize that blocking, especially in Cloud Sites, costs cash – the money kind of cash. It means every single person that’s waiting idly for something to happen, is simultaneously consuming a scarce resource and holding onto resources – and you’re getting the bill. Since Cloud Sites monitors compute cycles, you want to make sure that all those cycles are doing productive work for you, not just serializing access to files unnecessarily.
So remember: cache is your friend.
Stay tuned for the next post and you’ll hear about database writes and how they impact performance in the cloud. Subscribe to this blog to see when the next blog comes out.

Joomla and Drupal also benefit greatly from caching. In my experience managing a very large Joomla website, we had terrible performance until we decided to turn on both Joomla caching and static file caching (for CSS/JavaScript files). If you make a policy to only use components and modules that are both efficient and cacheable, you’ll be able to scale up no sweat.
Great post!
There is another important practice to mention which may be what you mean by ‘event-driven updates’- write-through caches.
Hash-table databases like CouchDB or even SimpleDB may actually serve as good caches. I don’t know- I’ve never tried it.
Typically caches are most effective and much easier to implement in the common scenario of lots of reads and few writes. With a write-through cache, you make sure that everyone is looking at the most current data, and you don’t shoot yourself in the foot when it’s time to scale.
Also, ‘writing to a database’ probably assumes RDBMS.
,Wil
[...] Coding in the Cloud – Rule 1 – Cache Is Your Friend: Adrian Otto has some excellent tips on using cache to dramatically speed up your web application. [...]
Wil Sinclair,
Good point. I did not describe the difference between a write-back and write-through cache.
A write-through cache is where each time you add/update an item, you replace the cached item and the corresponding item in the backing store in a synchronous fashion. This way, you always have a view of the most current data. The update pays a performance penalty for the data consistency, so this works well in a usage pattern where writes are infrequent (most web apps are this way).
A write-back cache is where you simply update the cache, and have some policy by which the backing store gets subsequently updated in an asynchronous fashion. If your cache is memory based, this allows you to do a high speed write to the cache, and optimize your writes to the backing store using a batch method tuned for your setup. This approach can be better than a write-through cache in cases where your usage pattern is write-heavy.
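A minimal Python sketch of the write-through idea follows. The class name `WriteThroughCache` is an assumption for illustration, and a plain dictionary stands in for the backing store.

```python
class WriteThroughCache:
    """Every write goes to the backing store and the cache in the same call."""

    def __init__(self, backing_store: dict):
        self._store = backing_store  # stand-in for a database
        self._cache = {}

    def put(self, key, value) -> None:
        # The caller waits for the backing store: this is the write penalty.
        self._store[key] = value
        self._cache[key] = value

    def get(self, key):
        if key not in self._cache:
            # Miss: fetch from the backing store and remember the result.
            self._cache[key] = self._store[key]
        return self._cache[key]
```

Because `put` updates both places before returning, a later `get` never returns stale data.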
The key advantage of a write-back cache is that cache updates are very fast, and hot items in the cache are not demand fetched from the backing store upon access. A drawback happens in failure cases where you lose the cache due to a failure, and have not yet persisted new data to the backing store. It’s also possible to get into a tangle where the number of pending updates in the cache exceeds your ability to persist them to the backing store, which can lead to data loss. Memory allows writes that are hundreds or thousands of times faster than writing to disk so if you have a lot of updates it’s not hard to get stuck in this way, and need a mitigation strategy for it. You need to know what the update rate limit is on your backing store, and make sure that your update rate on the cache does not consistently exceed it.
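To make the trade-off concrete, here is a small Python sketch of a write-back cache with a cap on pending updates. The names (`WriteBackCache`, `max_pending`, `flush`) are assumptions for illustration, a dictionary stands in for the backing store, and forcing a synchronous flush when the backlog is full is just one possible mitigation strategy.

```python
class WriteBackCache:
    """Writes land in memory first; the backing store is updated later in batches."""

    def __init__(self, backing_store: dict, max_pending: int = 1000):
        self._store = backing_store  # stand-in for a database
        self._cache = {}
        self._dirty = set()          # keys not yet persisted
        self._max_pending = max_pending

    def put(self, key, value) -> None:
        self._cache[key] = value     # fast, memory-only write
        self._dirty.add(key)
        if len(self._dirty) >= self._max_pending:
            # Back-pressure: do not let the backlog grow without bound.
            self.flush()

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._store[key]
        self._cache[key] = value
        return value

    def flush(self) -> None:
        """Persist all pending updates in one batch."""
        for key in self._dirty:
            self._store[key] = self._cache[key]
        self._dirty.clear()

    def pending(self) -> int:
        return len(self._dirty)
```

Anything still counted by `pending()` is lost if the process dies before `flush` runs, which is the failure case described above.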
In web apps that use an SQL database as a backing store along with a cache, the arrangement is almost always an inverted write-through. The app updates the database, and purges the affected item(s) from the cache. The cache replacement is typically demand-based rather than proactive, meaning that the next request for that data re-populates the cache. Performance can be gained by updating the cache proactively, replacing the purged item with fresh data at write time.
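The purge-on-update arrangement, and the proactive variant, might look like this in Python. The class `PurgingCache`, its `proactive` flag, and the `db_reads` counter are hypothetical names for the sketch, and a dictionary stands in for the SQL database.

```python
class PurgingCache:
    """Update the database first, then purge or refresh the cached item."""

    def __init__(self, database: dict):
        self._db = database  # stand-in for an SQL database
        self._cache = {}
        self.db_reads = 0    # counts demand fetches, for illustration

    def read(self, key):
        if key not in self._cache:
            # Demand-based refill: the first reader after a purge pays for it.
            self.db_reads += 1
            self._cache[key] = self._db[key]
        return self._cache[key]

    def update(self, key, value, proactive: bool = False) -> None:
        self._db[key] = value
        if proactive:
            # Refresh now, so the next reader gets a cache hit.
            self._cache[key] = value
        else:
            # Purge only; the next read re-populates the cache.
            self._cache.pop(key, None)
```

With `proactive=False` the read that follows an update hits the database once; with `proactive=True` it does not.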
You mentioned using a distributed database as a cache. That could also work depending on the use pattern of the application. The trick is finding a good way to avoid hot spots in the distributed database. If you don’t solve that issue, you potentially end up with about the same problem that a centralized RDBMS has.
With caching you want a bunch of things all at once:
1) High Performance (fast) access to the data
2) Maximum efficiency accessing the data
3) Adequate server capacity to address the request workload
(among others)
When you begin to distribute a cache, you generally increase #3 at the expense of #2 and sometimes also at the expense of #1. Where possible it’s better to have a single cache that’s fast enough and has enough capacity to handle your entire workload. But that’s not cloud-like. So in a cloud environment you typically end up with multiple caches that either contain a subset of the cache data, or replicas of it. When your data is distributed among multiple caches, you risk having hot spots in the cache. When it’s replicated, you risk wasting resources storing the same cached data in multiple locations.
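The hot-spot risk can be shown with a few lines of Python. This is a sketch under assumptions: hash-based partitioning is only one way to spread keys over caches, and `shard_for` and `shard_load` are made-up helper names.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Pick a cache node for `key` by hashing it."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_load(requests: list, shard_count: int) -> Counter:
    """Count how many requests each cache node would serve."""
    return Counter(shard_for(key, shard_count) for key in requests)
```

Hashing spreads the keys evenly, but not the traffic: if one key (say, the front page) receives most of the requests, whichever node owns that key serves most of the load no matter how many nodes you add.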
The bottom line is that caching is a complicated subject. It’s complicated because there is no single golden solution that works best for all use cases. It all depends on the nature of the data you are storing, and how it’s accessed by your application and its users.
Adrian
The best information i have found exactly here. Keep going Thank you
[...] continues my series on Rules for Coding in the Cloud, rules I’ve developed after watching applications encounter problems at scale when deployed [...]