We all want guarantees. We can get guarantees on our automobiles, homes, phones, shoes, food, and even our pets. We’ll take a guarantee in, on, or about anything. As software developers, our clients want guarantees for a whole host of things, from delivery dates to service uptime. Some of those are difficult to promise.
What if we need to provide a guarantee around selling limited-stock items? Let’s imagine ourselves as software developers engaged in delivering an application for managing inventory. Multiple users will access this application at any given moment to update stock levels and sales of our client’s famous and highly-desirable widgets. With so much activity updating the state of our data, we need a straightforward way to avoid race conditions. Follow along to see how we use Ecto.Repo.update_all/3 to provide these guarantees.
Our first pass at handling this behavior may be a little naive, but we’re confident we can solve this problem.
@doc """
Take one from the widget inventory count.
"""
def decrease_widget_inventory_count_naive(%Widget{} = widget, amount)
when is_integer(amount) and amount > 0 do
if widget.inventory_count >= amount do
widget
|> Ecto.Changeset.change(
%{inventory_count: widget.inventory_count - amount}
)
|> UpdateAll.Repo.update()
else
{:error, :failed_out_of_stock}
end
end
Above, we get the current stock level from the database. Once we have that, we also check that the requested amount exceeds or equals the amount we have in stock; otherwise, we return an error to the user. If we have the amount in stock, we create a changeset with the modified stock amount and pass that to the Repo module to update the database. Job done! We can pat ourselves on the back and head to the kava bar, right?
What if more than one user at a time is purchasing Fancy Widget Co.’s widgets through our application? Hmm. That might actually be a problem. We capture the inventory_count with no guarantee that the value hasn’t changed between the time we capture it and the time we update the database. What if someone else has changed this widget’s inventory_count since then? We would be writing state to our database that’s incorrect. That’s problematic at best.
Let’s write a test real quick to see if this is as bad as we think it is.
test "two users buy a widget at the same time only one widget is available" do
# each user fetching the value from the database at the same time
{:ok, widget} = Inventory.create_widget(
%{name: "BinaryNoggin Conference", inventory_count: 1}
)
# each user getting their widget with the same set of data
tasks = [
Task.async(Inventory, :decrease_widget_inventory_count_naive, [widget, 1]),
Task.async(Inventory, :decrease_widget_inventory_count_naive, [widget, 1])
]
# capture results from both users
good_sales = tasks
|> Task.await_many()
|> Enum.filter(&(match?({:ok, _}, &1)))
|> Enum.count()
updated_widget = Repo.reload!(widget)
# oh no! both users have purchased a widget when we only had one to sell!
assert updated_widget.inventory_count == widget.inventory_count - good_sales
assert updated_widget.inventory_count >= 0
end
1) test widgets two users 'buy' a widget at the same time when only one widget is available
test/update_all/inventory_test.exs:61
Assertion with == failed
code: assert updated_widget.inventory_count == widget.inventory_count - good_sales
left: 0
right: -1
stacktrace:
test/update_all/inventory_test.exs:80: (test)
Finished in 0.09 seconds (0.00s async, 0.09s sync)
10 tests, 1 failure, 9 excluded
With only one item in stock according to our database, we’ve allowed two users to each purchase a widget. We’ll have some explaining to do if we ship this. Alright, let’s take another shot at this and see if Ecto.Repo.update_all/3 can help us provide the guarantee we’re looking for.
@doc """
Updates widget inventory count by amount
## Examples
iex> update_widget_inventory_count(widget, 10)
{:ok, :successful}
{:error, :failed_out_of_stock}
"""
def decrease_widget_inventory_count(widget, amount)
when is_integer(amount) and amount > 0 do
negative_amount = -amount
query = from(
w in Widget,
where: w.id == ^widget.id and w.inventory_count >= ^amount,
select: w,
update: [inc: [inventory_count: ^negative_amount]]
)
case UpdateAll.Repo.update_all(query, []) do
{1, [widget]} -> {:ok, widget}
_ -> {:error, :failed_out_of_stock}
end
end
Let’s walk through the above function a bit and see if we’ve made a better choice than before. We’ve written a query to check for our widget and to see if there is sufficient inventory_count for us to purchase. Also, as part of our query, we’ve included an update operation. Passing the inc or increment operator to the update operation allows us to directly change the value in the database only if the conditions of the query are met. We previously stored an inverted value of our amount since there is no dec or decrement operator. We get a return tuple with the number of items updated and a nil. Any other results would indicate an error that we pass back to the user.
This bears all the hallmarks of an acceptable solution. Let’s put it to the test.
test "two users 'buy' a widget at the same time when only one is available" do
# each user fetching the value from the database at the same time
{:ok, widget} =
Inventory.create_widget(
%{name: "BinaryNoggin Conference", inventory_count: 100}
)
# each user getting their widget with the same set of data
tasks = [
Task.async(Inventory, :decrease_widget_inventory_count, [widget, 1]),
Task.async(Inventory, :decrease_widget_inventory_count, [widget, 1])
]
# capture results from both users
good_sales = tasks
|> Task.await_many()
|> Enum.filter(&(match?({:ok, _}, &1)))
|> Enum.count()
updated_widget = Repo.reload!(widget)
assert updated_widget.inventory_count == widget.inventory_count - good_sales
assert updated_widget.inventory_count >= 0
# well look at that we've prevented a false sale!
# we've avoided a really uncomfortable problem with atomicity.
# we've now proven our guarantee!
end
Excluding tags: [:test]
Including tags: [line: "84"]
.
Finished in 0.08 seconds (0.00s async, 0.08s sync)
10 tests, 0 failures, 9 excluded
We’ve just executed an atomic action. With our database acting as our ‘traffic cop’, we cannot exceed the number of available widgets when selling to client customers. We have avoided a messy business situation involving apologies, hurt feelings, and wasted time. All other things being equal, we can now offer this guarantee to our clients.
There are caveats when using tools and update_all/3 is no exception. When using update_all/3, only the fields included in your query are updated. Ecto-generated timestamps are not updated unless you specifically include and update them.
Using Ecto.Repo.update_all/3 allows us to make promises to our clients and their customers that we can feel good about. There are, of course, a number of other uses for update_all/3. Reference the docs here for more information about it.
Update - 2022/12/21
We've received some concerns challenging the assertion that what we describe above is an atomic action. When we referenced the documentation for transaction isolation in Postgres 15 we found the following:
Read Committed is the default isolation level in PostgreSQL. When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions. In effect, a SELECT query sees a snapshot of the database as of the instant the query begins to run. However, SELECT does see the effects of previous updates executed within its own transaction, even though they are not yet committed. Also note that two successive SELECT commands can see different data, even though they are within a single transaction if other transactions commit changes after the first SELECT starts and before the second SELECT starts.
Our original concern was avoiding a state change from another transaction or event while we are in the process of executing _our_ transaction. So our assertion that using `update_all/3` is _atomic_ does not hold up.
This did lead us to try to test the control we get from using our `update_all/3` solution and to ask, "What do we want from this code?" Our answer was simple, "We don't want to sell widgets when there are none in stock." Let's explore the test we created to poke at our code:
@tag timeout: 130_000
test "more purchase attempts than re-stocks attempts are made" do
# each user fetching the value from the database at the same time
{:ok, widget} = Inventory.create_widget(
%{name: "BinaryNoggin Conference", inventory_count: 1}
)
# return actions for evaluation
tasks = [
Task.async(__MODULE__, :increase_widget_count, [widget, 500]),
## Note to the reader: reduced for the sake of brevity
Task.async(__MODULE__, :decrease_widget_count, [widget, 700])
]
# capture results from many users
actions = tasks
|> Task.await_many(130_000)
IO.inspect(widget.inventory_count, label: "Original stock")
good_increments = actions
|> Enum.filter(&(match?{:increased, _count}, &1))
|> Enum.reduce(0, fn {:increased, count}, acc -> acc + count end)
|> IO.inspect(label: "Re-stocks")
good_sales = actions
|> Enum.filter(&(match?{:decreased, _count}, &1))
|> Enum.reduce(0, fn {:decreased, count}, acc -> acc + count end)
|> IO.inspect(label: "Sales")
updated_widget = Repo.reload!(widget)
IO.inspect(updated_widget.inventory_count, label: "Final count")
assert updated_widget.inventory_count ==
widget.inventory_count + good_increments - good_sales
assert updated_widget.inventory_count == 0
end
This test should apply a significant amount of pressure from multiple 'users' buying widgets and restocking the widgets at the same time. Let's run it and see the results:
Excluding tags: [:test]
Including tags: [line: "84"]
Original stock: 1
Re-stocks: 6145
Sales: 6146
Final count: 0
.
Finished in 24.1 seconds (0.00s async, 24.1s sync)
We've run the above test over a hundred times and we haven't been able to create a fail condition around selling stock that doesn't exist. We have failed for insufficient database connections or waiting too long for a database connection to become free for the `Repo` to use.
Currently, our takeaway from all of this, is that you can probably solve the example problem with code like this. Some monitoring/alerting around these actions using something like `telemetry` would be advisable.
If you require absolute guarantees you should probably be looking into Transaction Isolation in the Postgres docs. If you can bare some 'slippage' monitoring and the `inc`, and `dec` behavior exposed in `update_all/3` can solve your problem without the performance cost of row/table locking or transaction isolation.
Thank you so much to those who engaged with us on this topic. We relish the opportunity to discuss these concepts.