Consistency Problems in a Microservice Architecture (Part II)

Wenbo Zong
Jul 10, 2019


This post is part 2 of a three-part series on the consistency problems arising in a microservice architecture. We’ll talk about how to achieve consistency in a “workflow” scenario.

Workflow Consistency in EDA

We can view a distributed workflow as a series of asynchronous change updates, and it maps directly to the Event Notification architecture outlined in Martin Fowler’s talk, because a workflow, in our definition, is one-way traffic.

The key to implementing an asynchronous change update is that the local data update and the sending of the event must be atomic; otherwise we may lose events, which eventually leads to inconsistency. The simple trick to achieve this atomicity is to record the event in the same local transaction as the data update, and send the event only after the local transaction succeeds.
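As a rough illustration of this trick, here is a minimal sketch in Python. It uses SQLite as a stand-in for the service’s local database; the table names (`accounts`, `events`), the column layout and the function names are assumptions made for this example, not part of any particular system.

```python
import json
import sqlite3
import uuid


def init_schema(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) account table and event table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts ("
        "id TEXT PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id TEXT PRIMARY KEY, type TEXT NOT NULL, payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )


def create_account(conn: sqlite3.Connection, email: str) -> str:
    """Store the account and its event in one local transaction.

    Returns the id of the recorded event, which starts out as 'pending'.
    """
    account_id = str(uuid.uuid4())
    event_id = str(uuid.uuid4())
    payload = json.dumps({"account_id": account_id, "email": email})
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the account row and the event row are stored together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO accounts (id, email) VALUES (?, ?)",
            (account_id, email),
        )
        conn.execute(
            "INSERT INTO events (id, type, payload) VALUES (?, ?, ?)",
            (event_id, "account_created", payload),
        )
    return event_id
```

Nothing is sent to the message queue inside the transaction; sending happens afterwards, driven by the rows that are still marked “pending”.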

Using a Local Transaction to Guarantee At-least-once Delivery

Imagine a very simple account management system which consists of an Account service (create/delete/update account, etc) and a Notification service (e.g. sending email). Suppose the business rule says that an email notification needs to be sent upon successful registration of a new user.

The normal workflow is as follows:

  1. Account service creates an account upon user request
  2. Account service sends an event to a message queue to notify the Notification service, after which it returns the result to the user
  3. Notification service picks up the event and sends the welcome email

Which step can fail?

  • Step 1 can fail, but it won’t result in any inconsistency because the whole workflow simply aborts.
  • If step 2 or step 3 fails, the new user will not receive the welcome email. However, we do not want to roll back step 1, as that does not make sense. So there is an inconsistency.

How to fix?

The overall design is as follows:

Using a local transaction to ensure workflow consistency
  • To fix step 2 failure, we can record the event in a local table before sending to the MQ. But this must be done as part of a local transaction together with the account creation. The initial status of the event is “pending”. The Account service will then send the event to the MQ and update the status to “sent”.
  • To counter the sending failure, we need a cron job to periodically scan the event table, send (or re-send) the pending events, and update each event’s status. Note that there is a chance of sending the same event multiple times, i.e. duplicate events, for example when the cron job is restarted right after sending out the event but before updating the status to “sent”.
  • How to fix step 3 failure depends on the business requirement. If we can tolerate duplicate welcome emails (which should be rare anyway), the Notification service can simply delay the ack to the MQ until the email has been sent successfully.

If duplicate emails are not allowed, the Notification service must ensure idempotency, which we will discuss next.
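The cron job described above could look roughly like the following sketch. It assumes an `events` table with `id`, `payload` and `status` columns in a local SQLite database, and a `publish` callable that stands in for whatever MQ client is in use; both are hypothetical names chosen for illustration.

```python
import sqlite3
from typing import Callable


def relay_pending_events(
    conn: sqlite3.Connection,
    publish: Callable[[str, str], None],
) -> int:
    """Send every pending event and mark it as sent.

    Returns the number of events sent in this run.
    """
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE status = 'pending' ORDER BY rowid"
    ).fetchall()
    sent = 0
    for event_id, payload in rows:
        try:
            publish(event_id, payload)  # may raise if the MQ is unreachable
        except Exception:
            continue  # leave the row as 'pending'; the next run retries it
        # A crash between publish() and this update leaves the row 'pending',
        # so the next run re-sends it. This is where duplicates come from.
        with conn:
            conn.execute(
                "UPDATE events SET status = 'sent' WHERE id = ?", (event_id,)
            )
        sent += 1
    return sent
```

Because the status is updated only after a successful send, an event is never marked “sent” without having been handed to the MQ, which gives at-least-once (not exactly-once) delivery.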

Idempotency

Idempotency is an important concept and can greatly simplify the implementation of workflows/transactions. An operation is idempotent if applying it multiple times has the same effect as applying it once. An example of an idempotent operation is HTTP PUT, because repeating the same PUT request always leads to the same result. On the other hand, HTTP POST is not idempotent in general, as it typically creates a new object each time it is applied.
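To make the difference concrete, here is a toy sketch using in-memory Python containers instead of real HTTP handlers. The function names `put_profile` and `post_profile` are made up for this example.

```python
def put_profile(store: dict, user_id: str, profile: dict) -> None:
    """PUT-like: replace the resource stored under a key chosen by the caller."""
    store[user_id] = dict(profile)


def post_profile(store: list, profile: dict) -> int:
    """POST-like: create a new resource on every call and return its index."""
    store.append(dict(profile))
    return len(store) - 1
```

Calling `put_profile` twice with the same arguments leaves the store in the same state as calling it once, while calling `post_profile` twice leaves two entries behind.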

Idempotency is important as it frees the higher level application from handling duplicate messages and instead the application can always pass the messages to the service. However, achieving idempotency within a service is not trivial at all. Most likely, we will need a global transaction identifier for each transaction.

Now let’s come back to the earlier question: what if duplicate emails are not allowed? The answer is that the Notification service must ensure idempotency:

  1. Get the event from the MQ and save it to a local database, mark the status as “pending”
  2. Send ack to the MQ
  3. Enclose the email sending and status update into a transaction.

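Unfortunately, strict idempotency is not possible here, because email sending is an external service whose side effect cannot be part of the local database transaction: if the Notification service crashes after the email is sent but before the status is updated, the email will be sent again on retry. But we can talk about how to achieve idempotency in general. For example, suppose instead of sending a welcome email, we want to put some initial credit into the user’s account. The update of the credit and the event processing status can then be protected by a local transaction, in the same manner as the Account service.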
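A minimal sketch of such a consumer is shown below, again with SQLite standing in for the local database. The `credits` and `processed_events` tables and the function names are assumptions for this example; the event id plays the role of the global transaction identifier, and a simple “already processed” table is used in place of a per-event status column.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) credit table and processed-event table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS credits ("
        "account_id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
    )


def apply_initial_credit(
    conn: sqlite3.Connection, event_id: str, account_id: str, amount: int
) -> bool:
    """Apply the credit at most once per event id.

    Returns True if the credit was applied, False if the event is a duplicate.
    """
    # One local transaction covers both the "seen this event" record and the
    # balance update, so a crash cannot leave one without the other.
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        if cur.rowcount == 0:
            return False  # the event id was already recorded: skip the update
        conn.execute(
            "INSERT OR IGNORE INTO credits (account_id, balance) VALUES (?, 0)",
            (account_id,),
        )
        conn.execute(
            "UPDATE credits SET balance = balance + ? WHERE account_id = ?",
            (amount, account_id),
        )
    return True
```

With this in place, the consumer can process whatever the MQ delivers, including re-deliveries, and ack only after the function returns.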

To sum up, for workflow consistency,

  • The event producer uses a local transaction to update the data and record the events to be sent
  • The event producer sends the events to the MQ until success (there could be duplicates)
  • The consumer must consume the events in an idempotent manner (with the help of a global transaction identifier)

Coming up, we’ll talk about distributed transactions, finally.

Part I: https://medium.com/@zongwb/distributed-transactions-in-a-microservice-architecture-271ec1cb235
