Notes

GenServer APIs

GenServer.Behaviour doesn't exist anymore - it's now just GenServer.

Also, calls to :gen_server.call and :gen_server.cast should instead be GenServer.call and GenServer.cast.

The start_link definition should now be:

def start_link do
  GenServer.start_link(__MODULE__, [], name: :list)
end

Supervisor APIs

Supervisor.Behaviour doesn't exist anymore - it's now just Supervisor.

Calls to :supervisor should instead be Supervisor.

Introduction

Hello again, and welcome to ElixirSips Episode 022: OTP Part 4 - Supervisors. In today's episode, we're going to talk about a core concept in building systems the Elixir way - "Let it crash." That is, don't concern yourself with keeping your code from crashing; just make sure that the system keeps running if something bad happens.

The way you manage this is by having your system be composed of multiple processes, managed by supervisors.

Supervisors

A supervisor exists to manage processes or other supervisors. There are various strategies defined in OTP to determine what to do when a child crashes, and there are ways to handle repeated failures of supervised processes.
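
To make the strategy and repeated-failure options a bit more concrete, here's a minimal sketch. It assumes a recent Elixir version, and the two Agents (and their names, :sketch_first and :sketch_second) are hypothetical stand-ins for real workers, not part of this episode's project.

```elixir
# Hypothetical children: two Agents standing in for real workers.
children = [
  %{id: :sketch_first, start: {Agent, :start_link, [fn -> 0 end, [name: :sketch_first]]}},
  %{id: :sketch_second, start: {Agent, :start_link, [fn -> 0 end, [name: :sketch_second]]}}
]

# :strategy decides which children get restarted when one of them crashes:
#   :one_for_one  - only the crashed child
#   :one_for_all  - every child
#   :rest_for_one - the crashed child and the children started after it
# :max_restarts and :max_seconds handle repeated failures: more than
# 3 restarts within 5 seconds makes the supervisor itself give up and exit.
{:ok, sup} =
  Supervisor.start_link(children,
    strategy: :rest_for_one,
    max_restarts: 3,
    max_seconds: 5
  )

Supervisor.count_children(sup)
# => %{active: 2, specs: 2, supervisors: 0, workers: 2}
```

The numbers here are arbitrary; the right restart limits depend on how your children tend to fail.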

Since we like to learn about things by playing with them, we'll lay out a project we're going to use a supervisor for.

Project

We're going to start off by building a ListServer that lets us manage a list of items.

Go ahead and create a new project:

mix new supervised_list_server
cd supervised_list_server

We'll write a quickie test for our server. Open up test/list_server_test.exs:

defmodule ListServerTest do
  use ExUnit.Case

  # Clear the ListServer before each test
  setup do
    ListServer.start_link
    ListServer.clear
  end

  test "it starts out empty" do
    assert ListServer.items == []
  end

  test "it lets us add things to the list" do
    ListServer.add "book"
    assert ListServer.items == ["book"]
  end

  test "it lets us remove things from the list" do
    ListServer.add "book"
    ListServer.add "magazine"
    ListServer.remove "book"
    assert ListServer.items == ["magazine"]
  end
end

If you run the tests, they fail because there's no ListServer, so let's get started. Open up lib/list_server.ex and just add this bit - it should all make sense, it's nothing we haven't done before, but it's good practice to see it some more:

defmodule ListServer do
  use GenServer

  ### Public API
  def start_link do
    # Register the server under the name :list so callers don't need its pid
    GenServer.start_link(__MODULE__, [], name: :list)
  end

  def clear do
    GenServer.cast :list, :clear
  end

  def add(item) do
    GenServer.cast :list, {:add, item}
  end

  def remove(item) do
    GenServer.cast :list, {:remove, item}
  end

  def items do
    GenServer.call :list, :items
  end

  ### GenServer API
  def init(list) do
    {:ok, list}
  end

  # Clear the list
  def handle_cast(:clear, _list) do
    {:noreply, []}
  end
  # Append an item to the end of the list
  def handle_cast({:add, item}, list) do
    {:noreply, list ++ [item]}
  end
  # Remove the first occurrence of an item
  def handle_cast({:remove, item}, list) do
    {:noreply, List.delete(list, item)}
  end

  # Reply with the current list and keep it as the state
  def handle_call(:items, _from, list) do
    {:reply, list, list}
  end
end

Alright, run the tests now and they should pass. I actually built this by running those tests iteratively, but that's not the focus of this episode so I didn't want to spend the time doing it.

Now we're going to add a misfeature - perchance a bug. We're adding a new function that will crash the ListServer. Add the following:

def crash do
  GenServer.cast :list, :crash
end

#...

def handle_cast(:crash, _list) do
  # This match can never succeed, so it raises a MatchError and kills the process
  1 = 2
end

To verify that this causes a crash, let's open up an iex session and see it happen:

iex -S mix

Now start a ListServer, add some items, remove some items, list them, and then cause a crash. Try to list them after crashing it:

ListServer.start_link
ListServer.add "book"
ListServer.items
ListServer.add "cane"
ListServer.items
ListServer.remove "cane"
ListServer.items
ListServer.crash
ListServer.items

OK, so once we've crashed, we can't use it any more. This sucks, because Erlang and Elixir systems are supposed to be fault-tolerant, and here we made something crash and it didn't tolerate it. But of course, that's because we haven't finished building a proper system. This is where OTP's supervisors come in. Let's go ahead and add a supervisor.
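
If you'd rather see the "can't use it any more" part outside of iex, here's a rough sketch. It uses a plain Agent registered as :lonely_list as a hypothetical stand-in for an unsupervised ListServer, so it runs on its own.

```elixir
# Hypothetical stand-in for an unsupervised ListServer: an Agent named :lonely_list.
# Agent.start/2 (rather than start_link) means we aren't linked to it,
# so killing it doesn't take this process down too.
{:ok, pid} = Agent.start(fn -> ["book"] end, name: :lonely_list)

# Kill it and wait until it is really gone.
ref = Process.monitor(pid)
Process.exit(pid, :kill)

receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
end

# Nothing restarts the process, so calling it now exits the caller.
result =
  try do
    Agent.get(:lonely_list, & &1)
  catch
    :exit, _reason -> :no_server
  end

result
# => :no_server
```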

Open up lib/list_supervisor.ex and add the following:

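defmodule ListSupervisor do
  use Supervisor

  def start_link do
    Supervisor.start_link(__MODULE__, [])
  end

  def init(list) do
    # worker/2 describes a child that is started via ListServer.start_link with the given args
    child_processes = [ worker(ListServer, list) ]
    # :one_for_one restarts only the child that crashed
    supervise child_processes, strategy: :one_for_one
  end
end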
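
A note for anyone on a recent Elixir: worker and supervise come from Supervisor.Spec, which has since been deprecated. Here's a sketch of the same idea written with a child spec map and Supervisor.init instead. ListServerStandIn and ChildSpecListSupervisor are hypothetical names; the stand-in is a trimmed server included only so the sketch runs by itself, and you'd swap in ListServer in the real project.

```elixir
# Trimmed, hypothetical stand-in for ListServer so the sketch runs on its own.
defmodule ListServerStandIn do
  use GenServer

  def start_link do
    GenServer.start_link(__MODULE__, [], name: :list_stand_in)
  end

  def items, do: GenServer.call(:list_stand_in, :items)

  @impl true
  def init(list), do: {:ok, list}

  @impl true
  def handle_call(:items, _from, list), do: {:reply, list, list}
end

defmodule ChildSpecListSupervisor do
  use Supervisor

  def start_link do
    Supervisor.start_link(__MODULE__, [])
  end

  @impl true
  def init(args) do
    # A child spec map: :id names the child, :start is {module, function, args}.
    # With args == [], the child is started via ListServerStandIn.start_link/0.
    children = [
      %{id: ListServerStandIn, start: {ListServerStandIn, :start_link, args}}
    ]

    # :one_for_one restarts only the child that crashed
    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

Calling ChildSpecListSupervisor.start_link starts the stand-in server as a supervised child, the same way ListSupervisor.start_link does for ListServer.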

Now open up iex -S mix and try something very similar to what you did before:

ListSupervisor.start_link
ListServer.add "book"
ListServer.items
ListServer.add "cane"
ListServer.items
ListServer.remove "cane"
ListServer.items
ListServer.crash
ListServer.items

The last call was different - after the crash, we still had a list server! It just didn't have the state from before the crash. In the next episode, we'll work through storing state outside of our process, so we can survive crashes. That part's new to me, so it's fun! See you soon!
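
If you want to watch the restart happen in a script rather than in iex, here's a small sketch. It assumes a recent Elixir version and uses an Agent registered as :demo_list as a hypothetical stand-in for ListServer; the polling helper is just one simple way to wait for the replacement process.

```elixir
# Hypothetical stand-in for ListServer: an Agent registered as :demo_list.
children = [
  %{id: :demo_list, start: {Agent, :start_link, [fn -> [] end, [name: :demo_list]]}}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

Agent.update(:demo_list, fn list -> list ++ ["book"] end)
old_pid = Process.whereis(:demo_list)

# Kill the child, playing the role of ListServer.crash.
Process.exit(old_pid, :kill)

# Poll until the supervisor has registered a replacement process.
await_restart = fn await_restart ->
  case Process.whereis(:demo_list) do
    pid when is_pid(pid) and pid != old_pid ->
      pid

    _not_yet ->
      Process.sleep(10)
      await_restart.(await_restart)
  end
end

new_pid = await_restart.(await_restart)

# A new process answers under the same name, but the old state is gone.
Agent.get(:demo_list, & &1)
# => []
```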