Playing with property-based tests

I’ve never really tried property-based testing in anger before (just a brief play with QuickCheck during an ill-fated attempt to learn me some Haskell), but I recently came across a practice kata that seemed perfectly suited, so thought I’d give it a go. Unfortunately I haven’t got round to reading the book that I got in a recent PragProg sale, so it was mostly fumbling around and a lot of Googling.

The problem, basically, is this:

Given a list of names, generate a set of randomised “secret Santa” gift tags, such that no-one ends up giving themself a present.

This kind of thing isn’t always easy to test using traditional example-based testing, because of its intrinsic unpredictability, but using a property-based approach instead allows us to define a set of rules that should always hold true, and have the test framework automatically throw a variety of input data at your code and check that the rule isn’t broken.

The rules I came up with were:

returns the same number of tags as supplied names
includes each name once as a sender
includes each name once as a recipient
does not tag anyone’s present as from themself

The framework I used was PropCheck, which is an Elixir wrapper around the Erlang proper library. It acts as an extension to ExUnit, so you can mix and match example- and property-based tests.

PropCheck comes with a number of built-in generators, which will produce instances of primitive types like integers, strings, lists etc. Strings behave a bit oddly, thanks to the way Elixir strings are based on Erlang binaries, whereas Erlang natively uses charlists (also supported, but not commonly used, in Elixir). When QuickCheck generates binaries, it tends towards using zeros as the individual codepoint values, rather than a printable character. This means you’ll see values like <<0, 1, 0>>, rather than "ABA" (which is equivalent to <<65, 66, 65>>).

Initially, I picked a simple list of binaries as the input, meaning I could define my first property like this:

property "returns the same number of tags as supplied names" do
  forall names <- list(binary()) do
    length(SecretSanta.tags(names)) == length(names)
  end
end

For the later properties though, it became clear that not all lists were valid input. Specifically neither empty nor single-element lists really made sense, and neither did zero-character names. This is one of the advantages of this kind of testing â€“ the framework will come up with all sorts of edge cases, forcing you to stop and think whether they need to be handled, or explicitly excluded.

Restricting the input data meant writing a simple custom generator. I actually split it up into several smaller generators, which built on each other. First, let’s have some more realistic names. For simplicity, let’s say a firstname and a surname are each eight characters long (and are mysteriously made up of non-printable characters!):

defp name() do
  let([first <- binary(8), second <- binary(8)], do: first <> " " <> second)
end

The let macro allows us to take some existing generators, and transform their output â€“ in this case joining two strings with a space to make a name.

Now we can create a list of names. My first approach to making a minimum length list didn’t work, but after getting some help via a GitHub issue I ended up with this, which basically says “append two names to a list of names”. That list may be empty, but by appending two values to it we are guaranteed a minimum of two elements:

defp at_least_two_names() do
  let([first <- name(), second <- name(), many <- list(name())],
    do: [first, second | many]
  )
end

Again, although it seemed at first that this was enough, the tests revealed another hidden assumption: the names in the list need to be unique. I extended the previous generator to enforce this using such_that. This applies a guard to the generated values, forcing a new one to be produced if the condition is not met. If it can’t find a suitable value after a configurable number of tries, it will give up and fail the test.

defp at_least_two_unique_names do
  such_that(names <- at_least_two_names(), when: length(Enum.uniq(names)) == length(names))
end

With all those in place, here are the implementations of the properties listed earlier:

property "returns the same number of tags as supplied names" do
  forall names <- at_least_two_unique_names() do
    length(SecretSanta.tags(names)) == length(names)
  end
end

property "includes each name once as a sender" do
  forall names <- at_least_two_unique_names() do
    tags = SecretSanta.tags(names)
    names -- Enum.map(tags, & &1.from) == []
  end
end

property "includes each name once as a recipient" do
  forall names <- at_least_two_unique_names() do
    tags = SecretSanta.tags(names)
    names -- Enum.map(tags, & &1.to) == []
  end
end

property "does not tag anyone's present as from themself" do
  forall names <- at_least_two_unique_names() do
    tags = SecretSanta.tags(names)
    tags |> Enum.all?(&(&1.from != &1.to))
  end
end

And for completeness, here’s my implementation of the algorithm. It may not be pretty or efficient, but at least I know it works!

defmodule SecretSanta do
  def tags(names) do
    allocate(Enum.shuffle(names), Enum.shuffle(names))
  end

  defp allocate([], []), do: []

  # guard against leaving the same last person in both lists
  defp allocate([from, name], [to, name]) do
    allocate([from, name], [name, to])
  end

  # if the two heads are the same, shuffle the second list
  defp allocate(from, to) when hd(from) == hd(to) do
    allocate(from, Enum.shuffle(to))
  end

  defp allocate([from | from_tail], [to | to_tail]) do
    [%{from: from, to: to} | allocate(from_tail, to_tail)]
  end
end

Leave a Reply