A core Elixir data type that we've overlooked till now is its Stream type. From the documentation: "Streams are lazy, composable enumerables". Let's have a look at what that means.


We'll be exploring streams in iex.


The simplest stream I can think of just repeats a single number:

Stream.repeatedly(fn() -> 1 end) |> Enum.take(10)

That's not necessarily something we'd want though. What if you want a stream of random numbers?

Stream.repeatedly(&:rand.uniform/0) |> Enum.take(10)

Alright, what if you want to repeat a given static collection in a stream?

Stream.cycle([1,2,3]) |> Enum.take(10)

It's very common to have a file that is too large to efficiently load into memory, but perhaps you want to perform an operation on each line - generate a report, sum some fields in a CSV, etc. You can generate a stream around the file using Stream.resource/3. We'll generate a stream of the system's dictionary file:

file_stream = Stream.resource(fn -> File.open!("/usr/share/dict/words") end,
                              fn(file) ->
                                case IO.read(file, :line) do
                                  data when is_binary(data) -> {[data], file}
                                  _ -> {:halt, file}
                                end
                              end,
                              fn(file) -> File.close(file) end)

Here, the first argument provides the resource. The second argument describes how to iterate through the resource. It returns a tuple containing two elements: a list of the data to emit from this iteration, and the resource. When it returns {:halt, file} instead, the stream is considered finished. Finally, the last function describes how to close out the resource when the stream is done.
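
If you don't have a dictionary file handy, here's a minimal sketch of the same three-function shape against an in-memory StringIO device. It assumes a recent Elixir release, where the second function returns {[element], resource} to emit data and {:halt, resource} to finish. The line_stream name and the sample contents are made up for illustration.

```elixir
# Hypothetical helper: streams lines from an in-memory StringIO device
# instead of a real file, so the sketch runs anywhere.
line_stream = fn contents ->
  Stream.resource(
    # Start: open the resource.
    fn ->
      {:ok, device} = StringIO.open(contents)
      device
    end,
    # Next: emit one line per step, or halt once the device runs dry.
    fn device ->
      case IO.read(device, :line) do
        line when is_binary(line) -> {[line], device}
        _ -> {:halt, device}
      end
    end,
    # After: close the resource when the stream is done.
    fn device -> StringIO.close(device) end
  )
end

words = line_stream.("apple\nbanana\ncherry\n") |> Enum.to_list()
```

For the plain "lines of a file" case, File.stream!/1 gives you a lazy stream of lines directly. Stream.resource/3 is the more general tool for when you have to manage the resource yourself.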

Let's take 10 words from the dictionary, each 200 words apart:

file_stream |> Stream.take_every(200) |> Enum.take(10)

If you want to see the difference between lazy and eager evaluation, change the take_every call to come from the Enum module. I'm not going to do it here because it's boring, but the call takes around 3 minutes to complete for me.
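
A cheaper way to see the same effect is to count how many elements each pipeline actually touches. This is only a sketch on a small in-memory range rather than the dictionary file; the counter Agent and the count_and_pass function are names I made up for the illustration.

```elixir
# A counter process that records how many elements were visited.
{:ok, counter} = Agent.start_link(fn -> 0 end)

# Hypothetical helper: bumps the counter, then passes the element through.
count_and_pass = fn x ->
  Agent.update(counter, &(&1 + 1))
  x
end

# Lazy: elements are pulled through one at a time, and the pipeline
# stops as soon as Enum.take/2 has its three elements.
lazy =
  1..1_000
  |> Stream.map(count_and_pass)
  |> Stream.take_every(200)
  |> Enum.take(3)

# Read the lazy count and reset the counter for the next run.
lazy_visits = Agent.get_and_update(counter, fn n -> {n, 0} end)

# Eager: Enum.map/2 walks the whole range before take_every ever runs.
eager =
  1..1_000
  |> Enum.map(count_and_pass)
  |> Enum.take_every(200)
  |> Enum.take(3)

eager_visits = Agent.get(counter, & &1)
```

Both pipelines return the same three numbers, but lazy_visits comes out well below eager_visits, since the eager version visits all 1,000 elements no matter how few you keep.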

Let's create a stream which cycles the 2001st through 2004th words in the dictionary:

file_stream |> Stream.drop(2000) |> Stream.take(4) |> Enum.to_list |> Stream.cycle |> Enum.take(20)

I should note that I wouldn't expect the call to Enum.to_list to be required, but Stream.cycle doesn't repeat if you take it out. I've filed this as a bug against the language, since I would expect Stream to be a fine analogue for Enum in most cases, and this is one of them. I added a link to it in the Resources section if you're interested.

You can still get items with their indices, which is convenient in lots of ways:

file_stream |> Stream.drop(2000) |> Stream.take(4) |> Enum.to_list |> Stream.cycle |> Stream.with_index |> Enum.take(20)
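
To make the shape of those results concrete, here's a tiny sketch on a made-up two-element list instead of the dictionary. Each element comes back as an {element, index} tuple, with the index starting at zero and counting across cycles.

```elixir
# Stream.with_index/1 pairs each element with a zero-based index.
# The index keeps counting up even though the cycle repeats.
indexed =
  [:a, :b]
  |> Stream.cycle()
  |> Stream.with_index()
  |> Enum.take(3)

# indexed is [{:a, 0}, {:b, 1}, {:a, 2}]
```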

We can also easily filter results from the stream. Let's reject any words with a, b, c, d, or e in them. As a reminder, filter keeps the elements for which the function returns true:

file_stream |> Stream.filter(fn(x) -> !Regex.match?(~r/[abcde]/i, x) end) |> Enum.take(20)
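
Since the predicate above is negated, the same thing can be spelled with Stream.reject/2, which drops the elements for which the function returns true. Here's a small sketch on a made-up word list (not the dictionary file) showing that the two forms agree:

```elixir
# Sample words, each with a trailing newline like lines read from a file.
sample = ["sky\n", "apple\n", "moon\n", "tint\n"]

# filter keeps elements where the function returns true...
kept =
  sample
  |> Stream.filter(fn x -> !Regex.match?(~r/[abcde]/i, x) end)
  |> Enum.to_list()

# ...while reject drops them, so the negation disappears.
also_kept =
  sample
  |> Stream.reject(fn x -> Regex.match?(~r/[abcde]/i, x) end)
  |> Enum.to_list()

# Both are ["sky\n", "moon\n", "tint\n"]
```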

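You can also filter and map in one pipeline. Older Elixir versions had Stream.filter_map/3 for this, but it has since been deprecated in favor of composing filter and map:

file_stream |> Stream.filter(fn(x) -> !Regex.match?(~r/[abcde]/i, x) end) |> Stream.map(fn(x) -> String.upcase(x) end) |> Enum.take(20)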


Alright, that's not all of the functions available in Stream, but it's a pretty representative sample and a nice intro. See you soon!