Defining infinite structures in Haskell
by Elias Hernandis • Published Oct. 20, 2019 • Tagged programming languages, haskell
This article is part of a series on Haskell about how easy it is to define infinite structures in really concise ways. Building on the previous article about lazy evaluation, we explore how to define infinite structures using recursion without using a base case. While in other languages this would result in an infinite loop, this does not happen in Haskell due to its powerful lazy evaluation strategy.
First steps: an infinite list of ones
Recall from the previous post that in Haskell, parameters are evaluated only when they are needed. Let us revisit a simple example. Consider the following function definition.
f :: Int -> Int -> Int
f a b = a * 2
Calling f
with parameters a = 2 + 3
and b = 3 * 3
would result in the following execution:
f a b
= f (2 + 3) (3 * 3)
= (2 + 3) * 2
= 5 * 2
= 10
Notice how the value of the parameter b
is never computed. More importantly,
if the parameter b
did not contain a simple arithmetic expression, but rather
some thing more complex, Haskell would not waste time computing its value.
Moreover, if the parameter b
was undefined
, i.e. if it could not be
computed, the function call f a b
would not fail.
Now consider the following definition:
ones :: [Int]
ones = 1 : ones
What do you think would happen if we called ones
? Well, ones
is defined
recursively, but it does not have a base case, so normally we would say that
execution does not terminate, or that a maximum depth of recursion is reached
and the program stops, or something along those lines. In Haskell, this is
still true, calling ones
directly will result in the program never stopping.
But, what if we called take 3 ones
. Remember that the function take n xs
returns the first n
elements of the list xs
. And indeed,
ghci> take 3 ones
[1,1,1]
This not only doesn't fail, but returns the expected result. Let's look at the definition of take.
take :: Int -> [a] -> [a]
take 0 _ = [] -- base case
take n (x : xs) = x : take (n - 1) xs -- recursive case
And let's look at how Haskell handles the call take 3 ones
.
take 3 ones
= -- n = 3 dictates we go into the recursive case
-- parameters are passed without being evaluated
take 3 (1 : ones)
= -- in the body of the recursive case,
-- pattern matching assigns x = 1, xs = ones
1 : take 2 ones
= -- n = 2 dictates we go into the recursive case
-- the same pattern matching assignments are made
1 : 1 : take 1 ones
= -- n = 1 dictates we go into the recursive case
1 : 1 : 1 : take 0 ones
= -- n = 0 dictates we go into the base case
1 : 1 : 1 : take 0 ones
= -- this time, the body of the base case does not depend on xs
-- so it is not evaluated and the function returns
1 : 1 : 1 : []
= -- syntactic sugar
[1,1,1]
Magic! As the expression take 3 ones
did not depend on the full, infinite,
list of ones
, only the relevant part was evaluated and Haskell was able to
correctly compute the expected result!
More interesting infinite definitions
Base-case-less recursions need not be that simple. Assume we want to represent all of the natural numbers in Haskell. We could easily do so with
nats :: [Int]
nats = 0 : map (+1) nats
And indeed,
ghci> take 10 nats
[0,1,2,3,4,5,6,7,8,9]
(Exercise: try to expand the call stack for take 2 nats
as we did before)
This is a common enough case that Haskell has special syntax1 for it:
nats = [0..]
-- and we can even start later
[2..]
The sieve of Erathostenes
Erathosthenes was a Greek mathematician who gave an algorithm to find all prime numbers up to a limit. The algorithm is similar to trial division, only it is constructive (i.e. instead of testing wether a number is prime, it finds all prime numbers up to a limit). Lets look at the pseudocode
- List all numbers starting with
2
untiln
, wheren
is the upper bound in our search for primes. - Take the first element
p
in the list and remove all multiples of $p$ from the list, i.e. remove2p, 3p, 4p, ...
from the list. Another way to think of this is, remove a numberx
ifmod x p == 0
. - Go back to the previous step starting with the first remaining number.
This algorithm may be used without an upper bound if one wishes to find all of the prime numbers. In most languages this would not be possible but it is in Haskell:
-- List all numbers starting with 2, until infinity
primes :: [Int]
primes = sieve [2 .. ]
-- take the first number p in the list
sieve (p : xs) = p : ...
-- remove all multiples of p from the list
sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
-- Done! (the definition above is recursive)
And indeed,
ghci> take 4 primes
[2,3,5,7]
Lazy evaluation is a powerful evaluation strategy that allows us to concisely and expressively write definitions of infinite structures. In future posts, we will explore other consequences of this design decision of the Haskell language and some situations in which it does not prove so useful.
-
You might have seen the syntax
[a..b]
before. It is a shorthand for the functionenumFromTo
defined in the enum class. In fact,[a..]
is also a shorthand for the methodenumFrom
. Integers in Haskell are an instance of theEnum
class that defines these methods. ↩