Eventual Consistency for Cache Update with Read-Write Splitting

Wenbo Zong
4 min read · Nov 13, 2019

We all know that cache update/invalidation is hard to do correctly. Things can become particularly interesting when you set up the DB for read-write splitting. In this post, we will talk about how to achieve eventual consistency for cache update with read-write splitting.

Background

First, let’s review the Cache Aside Pattern. A more detailed description can be found in my previous post here.

  • The read path
Cache aside pattern: read path
  • The write path (cache update/invalidation)
Cache update
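
To make the two paths concrete, here is a minimal sketch in Python. The `db` and `cache` dicts are toy stand-ins for a real database and a Redis-like cache; all names are hypothetical.

```python
# Toy stand-ins for the real database and cache (hypothetical).
db = {}
cache = {}

def read(key):
    """Read path: try the cache first, fall back to the DB on a miss."""
    if key in cache:
        return cache[key]          # cache hit
    value = db.get(key)            # cache miss: load from the DB
    if value is not None:
        cache[key] = value         # populate the cache for subsequent reads
    return value

def write(key, value):
    """Write path: update the DB first, then invalidate the cache entry."""
    db[key] = value
    cache.pop(key, None)           # delete rather than update the cache
```

Note that the write path deletes the cache entry instead of updating it, so the next read repopulates the cache from the DB.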

When we set up read-write splitting for our MySQL database, all writes go to the master while reads are served from one or more slaves, with data replicated asynchronously from the master to the slaves.
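
As a rough illustration, the routing layer might look like the sketch below, assuming the pymysql client; the host names and credentials are hypothetical:

```python
import pymysql  # assumed MySQL client library

# Writes always go to the master; reads go to a (possibly lagging) slave.
master = pymysql.connect(host="mysql-master", user="app",
                         password="secret", database="shop")
slave = pymysql.connect(host="mysql-slave", user="app",
                        password="secret", database="shop")

def execute_write(sql, params=()):
    with master.cursor() as cur:
        cur.execute(sql, params)
    master.commit()

def execute_read(sql, params=()):
    with slave.cursor() as cur:
        cur.execute(sql, params)
        return cur.fetchall()
```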

The problem

Now, when caching is combined with read-write splitting, something interesting happens, as illustrated below:

Problem: Cache update with read-write split

The sequence of actions is:

  1. Client A updates the data in the master DB
  2. Client B tries to read the data, but it’s not found in the cache
  3. Client B loads the data from the slave DB
  4. Client A deletes the data from the cache (it is not present anyway)
  5. Data is synced from the master to the slave DB (background job)
  6. Client B writes the data to the cache (as it was a cache miss previously), and the data is stale (different from the master copy).

As you can see, if things happen in this sequence, we end up with inconsistent data between the DB and the cache!
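
The interleaving is easier to see when executed step by step. Here is a toy, single-threaded replay of the six steps, again using plain dicts as stand-ins for the master DB, the slave DB, and the cache:

```python
master, slave, cache = {"x": "v1"}, {"x": "v1"}, {}

master["x"] = "v2"                # 1. Client A updates the master DB
value = cache.get("x")            # 2. Client B reads: cache miss (None)
value = slave["x"]                # 3. Client B loads "v1" from the slave DB
cache.pop("x", None)              # 4. Client A deletes from the cache (no-op)
slave["x"] = master["x"]          # 5. Replication syncs the slave to "v2"
cache["x"] = value                # 6. Client B caches the stale "v1"

assert cache["x"] != master["x"]  # the cache now disagrees with the master
```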

The solution

The root cause of the data inconsistency is the delay in syncing data from the master to the slave DB. To fix it, we should delete (or update) the cache entry after the slave DB has been updated. We can parse the binlog of the slave DB (not the master!) to capture the update event, then delete/update the cache accordingly, as illustrated below:

Solution: Cache update with read-write split

The sequence of actions is:

  1. Client A updates the data in the master DB
  2. Client B tries to read the data, but it’s not found in the cache
  3. Client B loads the data from the slave DB
  4. Client B writes the data to the cache (as it was a cache miss previously), and the data is stale (different from the master copy)
  5. Data is synced from the master to the slave DB (background job)
  6. The binlog parser captures the update event on the slave DB and publishes it to an MQ; the cache update logic, which subscribes to the MQ, picks up the event and deletes the (stale) data from the cache, as sketched below. The cached data is now eventually consistent!
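
Here is a minimal sketch of that consumer, assuming the update events arrive as JSON messages on a Kafka topic and the cache lives in Redis. The topic name, event format, and key scheme are all assumptions; in practice, the events would be produced by a binlog parser such as Canal or Debezium.

```python
import json

import redis                     # assumed: redis-py client
from kafka import KafkaConsumer  # assumed: kafka-python client

cache = redis.Redis(host="cache-host", port=6379)
consumer = KafkaConsumer("slave-binlog-events",
                         bootstrap_servers="mq-host:9092")

for message in consumer:
    # Hypothetical event format: {"table": "orders", "pk": 42, "op": "update"}
    event = json.loads(message.value)
    if event.get("op") in ("update", "delete"):
        # Delete rather than update the cache entry; the next read
        # repopulates it from the (now consistent) slave DB.
        cache.delete(f'{event["table"]}:{event["pk"]}')
```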

There is one edge case, though: if the time lag between step 3 and step 4 is very long, step 6 may happen before step 4, and the stale value written in step 4 would then sit in the cache with nothing left to evict it. Arguably, the update event takes much longer to travel through the whole cache update pipeline than Client B takes to write to the cache, so the probability of step 6 happening before step 4 should be sufficiently low in practice.

PS: For binlog parsers, there are a few open-source implementations, such as Canal (Alibaba) and Debezium (Red Hat).

As a side note, the cache update pipeline in the above diagram can actually be generalised into a powerful data event framework, capable of capturing all sorts of update events and processing them for different purposes.

Generic data event processing framework
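
One simple way to structure such a framework is a dispatch table that fans each event out to any number of registered handlers (cache invalidation, search indexing, analytics, and so on). The handler names below are purely illustrative:

```python
from collections import defaultdict

handlers = defaultdict(list)

def subscribe(table):
    """Register a handler for data events on the given table."""
    def decorator(fn):
        handlers[table].append(fn)
        return fn
    return decorator

@subscribe("orders")
def invalidate_order_cache(event):
    print("delete cache key", f'orders:{event["pk"]}')

@subscribe("orders")
def reindex_order(event):
    print("reindex search document", event["pk"])

def dispatch(event):
    """Fan one data event out to all handlers subscribed to its table."""
    for handler in handlers[event["table"]]:
        handler(event)

dispatch({"table": "orders", "pk": 42, "op": "update"})
```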

Wrap up

In this post, I described the tricky cache update/invalidation problem that arises when MySQL is set up for read-write splitting, and how to achieve eventual consistency by parsing the binlog (of the slave DB) to capture the update events.

This is yet another example of designing for eventual consistency. Thanks for reading!
