Stale Data In Cached Pages

We have heard from many website owners whose sites were designed with fresh, live data in mind and who now find that their pages consistently show stale data.

Nine times out of ten this is because a caching strategy was not considered at the outset of the project. The website got built, everything looked fine, and so it was deployed to the production environment. Most likely somebody then observed that the site was a little sluggish and, after some Googling, determined that the appropriate fix was to enable page caching via the administration pages in Drupal.

The consequence is that, now the pages are cached, all of those live statistics such as page view counts, recent Tweets, latest news and new articles no longer update until the caches get flushed.

You may now find that you are constantly having to flush your cache. Not only does this quickly become a bothersome chore, it also adds to the workload on your server, since you are flushing all caches, not just the ones where new data is expected.
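To make the difference between flushing everything and flushing only what changed more concrete, here is a minimal sketch in Python. It is not Drupal's cache API; the `TaggedCache` class, its method names and the cache keys are all hypothetical, chosen only to illustrate tagging each cached item so that related items can be dropped together.

```python
from collections import defaultdict


class TaggedCache:
    """Toy in-memory cache (hypothetical, not Drupal's API) with tag-based invalidation."""

    def __init__(self):
        self._entries = {}                      # key -> cached value
        self._tags_by_key = {}                  # key -> set of tags on that entry
        self._keys_by_tag = defaultdict(set)    # tag -> keys carrying that tag

    def _forget(self, key):
        # Remove an entry and unlink it from every tag it carried.
        for tag in self._tags_by_key.pop(key, set()):
            self._keys_by_tag[tag].discard(key)
        self._entries.pop(key, None)

    def set(self, key, value, tags=()):
        self._forget(key)                       # replace any previous entry cleanly
        self._entries[key] = value
        self._tags_by_key[key] = set(tags)
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._entries.get(key)

    def invalidate_tag(self, tag):
        """Drop only the entries carrying `tag`; return how many were dropped."""
        keys = list(self._keys_by_tag.get(tag, ()))
        for key in keys:
            self._forget(key)
        return len(keys)

    def flush_all(self):
        """The blunt instrument: drop everything; return how many were dropped."""
        dropped = len(self._entries)
        self._entries.clear()
        self._tags_by_key.clear()
        self._keys_by_tag.clear()
        return dropped


cache = TaggedCache()
cache.set("page:/about", "<html>about</html>", tags=["static"])
cache.set("block:latest-news", "<ul>news</ul>", tags=["news"])
cache.set("page:/home", "<html>home</html>", tags=["news", "tweets"])

# A new article is published: only the entries that show news are rebuilt.
dropped = cache.invalidate_tag("news")
```

In this sketch, publishing a news item costs two cache rebuilds rather than three, and the rarely changing "about" page stays cached. On a real site the gap between "everything" and "only what changed" is usually far larger.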

What you really need is a caching strategy.

Comments

The expire module is pretty good at handling some parts of what this article is talking about - https://www.drupal.org/project/expire

The other part takes a lot of work, but the HTTP Parallel Request & Threading Library https://www.drupal.org/project/httprl/ has a nice wrapper function that was recently added to the dev branch. It allows for the caching of any function, with the added advantage of the cached result being updated in the background. So it will serve the cached version but re-generate a new version of the data in the background; the user gets a responsive page (due to cached output from the function call) and the data is continuously updated in the background. In short, caching without the downsides; httprl_call_user_func_array_cache() is the function to check out.
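The "serve the cached copy now, rebuild it in the background" idea described above can be sketched in a few lines of Python. This is not how httprl implements it, and it does not use httprl's API; the `BackgroundRefreshCache` class and all of its names are hypothetical, and a thread stands in for whatever background mechanism a real site would use.

```python
import threading
import time


class BackgroundRefreshCache:
    """Serve the cached value immediately; refresh stale values in the background."""

    def __init__(self, compute, max_age_seconds):
        self._compute = compute              # the expensive function being cached
        self._max_age = max_age_seconds
        self._lock = threading.Lock()
        self._value = None
        self._stored_at = None               # None means nothing cached yet
        self._refresh_thread = None

    def _refresh(self):
        value = self._compute()              # slow work happens outside the lock
        with self._lock:
            self._value = value
            self._stored_at = time.monotonic()
            self._refresh_thread = None

    def get(self):
        with self._lock:
            if self._stored_at is not None:
                stale = time.monotonic() - self._stored_at >= self._max_age
                if stale and self._refresh_thread is None:
                    # Kick off one background rebuild; the caller does not wait for it.
                    self._refresh_thread = threading.Thread(
                        target=self._refresh, daemon=True
                    )
                    self._refresh_thread.start()
                return self._value           # possibly stale, but instant
        # Cold cache: nothing to serve yet, so compute once in the foreground.
        # (Concurrent cold callers may each compute; acceptable for a sketch.)
        self._refresh()
        with self._lock:
            return self._value

    def wait_for_refresh(self):
        """Block until any in-flight background refresh has finished."""
        with self._lock:
            thread = self._refresh_thread
        if thread is not None:
            thread.join()


calls = []


def build_statistics_block():
    # Stand-in for an expensive query or remote request.
    calls.append(1)
    return len(calls)


stats = BackgroundRefreshCache(build_statistics_block, max_age_seconds=0)
first = stats.get()     # cold cache: computed in the foreground
second = stats.get()    # stale: old value served, rebuild started in the background
stats.wait_for_refresh()
third = stats.get()     # the rebuilt value is now being served
```

The trade-off is that a visitor may see data that is one refresh behind, which for page view counts or recent Tweets is usually a fair price for a page that never waits on the slow call.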

Some more great input here from mikeytown2.

We had some good fun with the expire module recently and we find it invaluable on some of the larger sites we operate. On one particular site there are several roles defined and each role gets access to different content within pages, so we are using authcache to handle the page caching.

After a while we started to notice that the caches weren't getting flushed as we would expect according to the rules defined with the expire module. It turned out that the number of roles was causing a huge amount of computational effort due to the way the authcache module has to permute all possible role combinations. Consequently an out-of-memory exception was being thrown, meaning the process to expire cache entries could not complete. Fortunately authcache provides hooks in the right places so that you can be a little more explicit about the role permutations that are necessary, and now everything is running nicely.
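To give a feel for why the role count hurts so quickly, here is a small Python sketch of the arithmetic. It is not authcache's code and makes a simplifying assumption: that every non-empty subset of roles is a candidate combination. The role names and function names are hypothetical.

```python
from itertools import combinations


def all_role_combinations(roles):
    """Yield every non-empty subset of `roles` (assumed worst case)."""
    for size in range(1, len(roles) + 1):
        yield from combinations(roles, size)


def count_role_combinations(role_count):
    """Number of non-empty subsets of `role_count` roles: 2**n - 1."""
    return 2 ** role_count - 1


roles = ["member", "editor", "moderator"]     # hypothetical role names
every_combination = list(all_role_combinations(roles))

# Being explicit: list only the combinations users on the site actually hold.
needed_combinations = [("member",), ("member", "editor")]

growth = {n: count_role_combinations(n) for n in (3, 10, 20)}
```

Under that assumption three roles give 7 combinations, ten give 1,023 and twenty give 1,048,575, so naming the handful of combinations that really occur keeps the work bounded.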

We have found https://www.drupal.org/project/httprl/ to be great. I am heading off to check out httprl_call_user_func_array_cache() right now. The ability to schedule the re-generation of complex content blocks in the background could be huge for us.