How to Invalidate (or Update) Cache Correctly?

Wenbo Zong
3 min readAug 8, 2019

There are only two hard things in computer science: cache invalidation and naming things — Phil Karlton

Cache invalidation/update is often done incorrectly without noticing it. Let’s take a look at the various strategies to update/invalidate cache.

To start off, let’s quickly recap how we populate cache on the read path.

Figure: How to populate cache on the read path

Cache Invalidation — Attempt 1: Update cache after updating database

Figure: Cache update after updating database

This strategy will not work. Consider this scenario: User A is trying to set X to 10, and user B trying to set X to 20. Imagine the actions taking place in the following sequence:

In the end, we have X=20 in the database but X=10 in the cache!

Cache Invalidation — Attempt 2: Delete cache before updating database

Figure: Delete cache before updating database

This strategy is also problematic. Consider this scenario: User A is trying to set X to 20, and User B is reading X. Image the actions taking place in the following sequence:

In the end, we have X=20 in the database, but X=10 in the cache!

Cache Invalidation — Attempt 3: Delete cache AFTER updating database

Figure: Delete cache after updating database

This strategy works most of the time, and it’s used by Facebook (according to a paper published 2013). However, there is still a possibility of inconsistent data. Imagine this sequence of actions:

The inconsistency occurs if there is a long enough delay between the DB read and the cache set by User B. Arguably the probability should be very very small, as the DB write (set X=20 in DB) normally takes longer than the cache set (set X=10 in cache).

Note that, the following sequence of actions does not pose any consistency issue, as the cache is deleted in the end:

Wrapping Up

Though it seems trivial to invalidate cache, it can be tricky. If you know a solution that is practical and does not have the issue of the 3rd solution described above, please leave a comment. The moral of the story is that, all cached items must expire after some time, which limits the ‘damage’ and allows to use a theoretically incorrect but practical solution.

References

https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf

https://www.quora.com/Why-does-Facebook-use-delete-to-remove-the-key-value-pair-in-Memcached-instead-of-updating-the-Memcached-during-write-request-to-the-backend

--

--