Idempotency: A Three-Step Approach
Idempotency is a much desired property when designing applications, especially distributed applications because they are more prone to failures and error handling (e.g. retry) is often essential. In this essay I try to present a somewhat systematic approach to implementing idempotency.
What is idempotency?
Let’s begin with refreshing ourselves with the definition of idempotency (from wikipedia):
Idempotence is the property of certain operations in mathematics and computer science, that can be applied multiple times without changing the result beyond the initial application.
Before we proceed, consider two trivial questions.
(1) Is a read operation idempotent? For example,
- First read, X=10
- Second read, X=20
(2) How about HTTP Delete? For example,
- First delete, status=200
- Second delete, status=404
The exact response of the second operation in each example is different, but these operations are actually deemed idempotent because they do not change the state of the resources. Hence, in the context of software, we may define idempotency as no side effects.
Idempotency != Safety
It is also worth noting that idempotency is different from safety. In the context of HTTP APIs, a HTTP method is idempotent if it guarantees that repeating a request multiple times has the same effect as issuing the request once. A quick summary of the HTTP methods is shown in the following table.
Why idempotency is important?
To illustrate why idempotency is a desired property for applications, let’s look at a request-response style RPC service and a messaging-based service, respectively.
Why Is Idempotency Important for RPC?
Consider the following API call to deduct money from an account.
If the caller does not receive a response, the caller cannot know the status of the request, because there are a few possibilities:
+------------------+----------------------------------------+
| Request status | Reason for not receiving the response |
+------------------+----------------------------------------+
| Processed | Network error or server crashes |
| Half-processed | Server crashes |
| Being processed | Timeout |
| Not processed | Server crashes |
+------------------+----------------------------------------+
The most common strategy to handle this type of error is retry. Some RPC frameworks may automatically retry on timeout, without the application code knowing it. (This might be a dangerous thing if the API is not idempotent.)
Why Is Idempotency Important for Messaging?
For messaging-based systems, there are two distinct failure scenarios, duplicate publishing and duplicate fetching, which are illustrated in the following two diagrams.
If the API is idempotent, the API callers can handle errors easily by simply retrying. Similarly, if the message handler is idempotent, it will allow at-least-once delivery (i.e. duplicate message publishing or fetching)
A Three-Step Approach
Before we present a solution, let’s consider the relationship between idempotency and concurrency.
Would idempotency issues arise in concurrent applications?
- Yes. Reasons include duplicate messages, duplicate requests.
Would idempotency issues arise in single-threaded applications?
- Yes. Ditto.
The only difference is that it’s slightly easier to handle in a single-threaded application, because not sharing data between requests & not updating the same data means no data race.
Hence, as a general solution, we need to
- Serialize concurrent requests by locking the resources;
- Check for duplicate request(s), return immediately if duplicate and release the resources;
- Execute the application logic if not duplicate and release the resources.
How to uniquely identify a request/operation?
To be able to distinguish a duplicate operation (API call, or event), we need an idempotency key. Typically, the idempotency key is a combination of request_id
and resource_id
. The exact identifier to be used depends on the application. For example, when reserving the inventory for an order, the order_id
and sku_id
may be used as the idempotency key.
The actual implementation would need to store the resource modification history (i.e. operation log) in some kind of database so that the next attempt can be checked for duplication. It also requires the ability to update the resource and the operation log atomically, in order to guard against failures. The processing flow is illustrated as follows:
Example
Let’s demonstrate how to apply the lock-check-execute approach to a familiar example: inventory reservation. As mentioned earlier, the order_id
and sku_id
can be used as the idempotency key.
Assume the MySQL tables are defined as follows:
CREATE TABLE inventory_tab (
`sku_id` INT NOT NULL PRIMARY KEY,
`count` INT NOT NULL DEFAULT 0,
`price` DOUBLE NOT NULL DEFAULT 0,
`updated_at` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
CREATE TABLE inventory_dedup_tab (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
`order_id` BIGINT NOT NULL,
`sku_id` INT NOT NULL,
`quantity` INT NOT NULL DEFAULT 0,
`updated_at` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
UNIQUE KEY (`order_id`, `sku_id`)
);
And the pseudo code of the processing logic is shown below:
START TRANSACTION;INSERT INTO inventory_dedup_tab(order_id, sku_id, `quantity`)
VALUES(@OrderNumber, @Sku, @Quantity);UPDATE inventory_tab set count=count+@Quantity
WHERE sku_id=@Sku AND count+@Quantity>=0;
SELECT ROW_COUNT() INTO @affected_rows;IF (@affected_rows > 0) THEN
COMMIT;
ELSE
ROLLBACK;
END IF;
The processing flow for this particular example would then look like this:
Of course, locking is expensive and distributed locking is even error prone, we should try to avoid it if possible. In this example, we can actually skip checking for duplication altogether, because the operation log is able to guarantee only one attempt will succeed. The simplified flowchart is as shown below:
Idempotency vs Optmistic Locking
Again, I’d like to highlight that optimistic locking does not guarantee idempotency. Please refer to my earlier post for a bit more details: https://medium.com/@zongwb/optimistic-locking-idempotency-936b87d1adc0
Wrap up
Idempotency is a desired property for any API to have, as it greatly simplifies the calling clients. I would urge all API developers to keep this in mind and strive for idempotency whenever possible. It is also a good practice to document if an API is idempotent and the side effects if not.
Again, thanks for reading :-)