## 4  Lazy pattern matching

### 4.1  Lazy pattern matching in theory

In previous sections we only considered strict ML. In a strict language, we can define ML pattern matching as a predicate operating on terms, which are also the values of program expressions. In other words, pattern matching applies to completely evaluated expressions, or normal forms.

Studying pattern matching in a lazy language such as Haskell requires a more sophisticated semantical setting. Essentially, lazy languages manipulate values that are known only partially and, more significantly for our study, pattern matching operates on such incomplete values.

Partial values are defined by the following grammar:

 v ::= Ω   (undefined value)
    ∣  c(v1, v2, … , va)   (constructed value, a head-normal form)

Definition 1 of the instance relation between patterns and values applies unchanged to partial values (we have _ ≼ Ω; value Ω possesses all types). Thus we keep the same notation ≼, and maintain Definition 3 of the instance relation for matrices.

However we cannot keep Definition 2 of ML pattern matching.

Example 3   Let us consider a simple example: case e of True -> 1 | _ -> 2. If the root symbol of expression e is not a constructor, its partial value is Ω. Then, since True ⋠ Ω and _ ≼ Ω, the value of the whole expression is 2. But, if we compute e further, its value may become True; the value of the whole expression then becomes 1. Something is wrong, since the value of the whole expression changed from 2 to 1.

More generally, partial values and computation interact. Let us consider some expression e. If the root symbol of expression e is a constructor c, then expression e is a head-normal form and we express the “current value” of e as c(v1, v2, … , va) — see Huet and Lévy [1991] for a more precise treatment in the context of term rewriting systems. Otherwise, the “current value” of expression e is Ω. Then, we consider the various “current values” along the evaluation of e. As constructors cannot be reduced, these values increase according to the following precision ordering.

Definition 7 (Precision ordering)   Relation ≼Ω is defined on pairs of values (v, w) as follows.
 Ω ≼Ω w
 c(v1, … , va) ≼Ω c(w1, … , wa)   iff   (v1 ⋯ va) ≼Ω (w1 ⋯ wa)
 (v1 ⋯ vn) ≼Ω (w1 ⋯ wn)   iff   vi ≼Ω wi, for all i ∈ [1…n]
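As a concrete illustration, the precision ordering can be modeled in a few lines of Python. This is only a sketch of Definition 7 under an encoding of our own devising: Ω is the string "Ω" and a constructed value c(v1, … , va) is a (name, args) tuple.

```python
# Minimal model of partial values and the precision ordering ≼Ω.
# Encoding (ours, for illustration only): Ω is the string "Ω",
# a constructed value c(v1, ..., va) is the tuple (name, (v1, ..., va)).

OMEGA = "Ω"

def c(name, *args):
    """Build the partial value name(args...)."""
    return (name, tuple(args))

def less_precise(v, w):
    """v ≼Ω w: w is at least as defined as v."""
    if v == OMEGA:                 # Ω ≼Ω w, for every w
        return True
    if w == OMEGA:                 # a constructor is never below Ω
        return False
    (cv, vs), (cw, ws) = v, w
    return (cv == cw and len(vs) == len(ws)
            and all(less_precise(a, b) for a, b in zip(vs, ws)))

# Ω may evaluate further to True, but not the other way around.
print(less_precise(OMEGA, c("True")))    # True
print(less_precise(c("True"), OMEGA))    # False
print(less_precise(c("Cons", OMEGA, OMEGA), c("Cons", c("True"), OMEGA)))
```

The ordering is pointwise on arguments, exactly as the third clause of Definition 7 extends ≼Ω to vectors.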

To be of practical use, a predicate P that defines pattern matching must be monotonic. That is, when P(v) holds, P(w) also holds for all w such that v ≼Ω w. With monotonic predicates, matching decisions do not change during computation. One should notice that, given any pattern p, the predicate p ≼ v is monotonic in v. Example 3 shows that the predicate P ⋠ v is not monotonic in general.

We thus need a new definition of pattern matching. For the moment, we leave most of lazy pattern matching unspecified.

Definition 8 (General (lazy) pattern matching)   Let P(P, v) be a predicate defined over pattern matrices P and value vectors v, where the size n of v is equal to the width of P. Row number i in P filters v, if and only if the following condition holds:

 P(P[1…i), v) ∧ pi ≼ v.

We call P the disambiguating predicate and now look for sufficient conditions on P that account for our intuition of pattern matching in a lazy language.

1. Pattern matching is deterministic, in the sense that at most one clause is matched. Hence, for all P and v, we assume:

    P(P, v) =⇒ P ⋠ v.

2. Matching the first row of a matrix reduces to the instance relation. Hence, for all v, we assume:

    P(∅, v) = True.

3. We require predicate P to be monotonic in its value component. That is, given any matrix P, for all value vectors v and w, we assume:

    P(P, v) ∧ v ≼Ω w =⇒ P(P, w).

The three conditions above are our basic restrictions on P. We further define UP and MP as U and M (Definition 6) parameterized by P.

Now, given a definition of lazy pattern matching, we face the temptation to assume that the computation of U described in Section 3.1 still works for UP. More precisely, by finding additional sufficient conditions on predicate P we aim at proving UP = Urec. Thus, we re-examine the proof of Proposition 2 in the context of lazy pattern matching.

Base cases follow from basic restrictions.

1. By our first basic restriction and since ( ) ≼ (), we have P(( ), ()) = False. Thus we have UP(( ), ()) = False.
2. By our second basic restriction, we directly get UP(∅, q ) = True.

To prove the inductive cases, it suffices to reformulate the key properties of Lemma 1, replacing P ⋠ v by P(P, v). However, the key properties now are assumed rather than established.

Definition 9 (Key properties)   We say that predicate P meets the key properties when the following four properties hold. For any matrix P, constructor c, and value vector v such that v1 = c(w1, … , wa), we assume:

 P(P, v)  ⇐⇒  P(S(c, P), S(c, v)).      (1)

Additionally, for any value vector v, we assume:

 P(P, (v1 v2 ⋯ vn)) =⇒ P(D(P), (v2 ⋯ vn)).      (2)

Furthermore, given any matrix P, let Σ be the set of the root constructors of P’s first column. If Σ is not empty, then for any constructor c not in Σ and any value vector (w1 ⋯ wa v2 ⋯ vn), we assume:

 P(D(P), (v2 ⋯ vn)) =⇒ P(P, (c(w1, … , wa) v2 ⋯ vn)).      (3)

If Σ is empty, then, for any value vector v, we instead assume:

 P(D(P), (v2 ⋯ vn)) =⇒ P(P, (v1 v2 ⋯ vn)).      (4)

It is not obvious that assuming the key properties suffices to prove that UP can be computed as U is, since Ω does not appear in the proof of Proposition 2. Indeed, monotonicity plays some part here.

Proposition 3   We have UP = Urec.
Proof: Base cases follow from the basic restrictions, while the proofs of all inductive cases of Proposition 2, except case 2-(a), apply unchanged.

Hence, we assume q1 to be a wildcard and the set Σ to be a complete signature. We need to prove:

 UP(P, (_ q2 ⋯ qn)) = ⋁k=1…z UP(S(ck, P), S(ck, q)).

By (1), for any constructor ck in Σ and any vector v such that v1 = ck(w1, … , wak), we have: v ∈ MP(P, q)  ⇐⇒  S(ck, v) ∈ MP(S(ck, P), S(ck, q)). Hence, a potential difficulty arises for vectors v in MP(P, q) whose first component v1 is Ω. Then, by monotonicity of P (and of ≼), by the non-empty type axiom, for any constructor c in the signature of the first column’s type t, there exists (w1 ⋯ wa) such that (c(w1, … , wa) v2 ⋯ vn) ∈ MP(P, q). That is (by (1), in the forward direction), UP(S(ck, P), S(ck, q)) holds for all ck in Σ. □

As an immediate consequence of the proposition above, the useless clause problem is now solved in the lazy case (see second item in Proposition 1). However, the exact formulation of exhaustiveness needs a slight change. Reconsider Example 3.

```case e of True -> 1 | _ -> 2
```

By definition of ≼, True ≼ Ω does not hold, hence Ω cannot match the first clause. However, P((True), Ω) does not hold either, by monotonicity. Hence, there is no row that filters value Ω, and Definition 4 would flag this matching as non-exhaustive, a clear contradiction with our intuition of exhaustiveness. Thus, we now directly define an exhaustive matrix P by the condition UP(P, (_ ⋯ _)) = False. That is, P is exhaustive, if and only if for all vectors v, P(P, v) does not hold. By this new definition, the example is exhaustive: for any value v in {Ω, False, True}, we have _ ≼ v. Thus, writing P for the matrix whose rows are (True) and (_), we have:

 P ≼ v.

Hence, by our first basic restriction, for all v, P(P, v) does not hold.

### 4.2  Lazy pattern matching, à la Laville

Laville’s definition of lazy pattern matching (Laville [1991]) stems directly from the need for a monotonic P: if we decide that some term is evaluated enough not to match a pattern, then we want this decision to remain true when the term is evaluated further. By definition, matrix P and value v are incompatible, written P # v, when making v more precise cannot produce an instance of P. That is, P # v means:

 ∀w,  v ≼Ω w =⇒ P ⋠ w.

Definition 10 (Lazy pattern matching (Laville))   Define P(P, v) = P # v in the generic Definition 8.

The first basic restriction follows by letting w be v in the definition of P # v, the second restriction follows from ∅ ⋠ w for all w, and monotonicity is a consequence of the transitivity of ≼Ω.

Incompatibility is the most general P in the following sense: for any predicate P, any matrix P, and any value vector v, we have:

 P(P, v) =⇒ P # v.

For, if P and v are compatible (i.e. not incompatible), then there exists w, with v ≼Ω w and P ≼ w. Thus, by the first basic restriction, P(P, w) does not hold, and, by the monotonicity of P, P(P, v) does not hold either.

Incompatibility is easily computed by the following rules.

 c(p1, … , pa) # c′(v1, … , va′)   (where c ≠ c′)
 c(p1, … , pa) # c(v1, … , va)   iff   (p1 ⋯ pa) # (v1 ⋯ va)
 (p1 ⋯ pn) # (v1 ⋯ vn)   iff   there exists i ∈ [1…n], pi # vi
 (p1 ∣ p2) # v   iff   p1 # v and p2 # v
 P # v   iff   pi # v, for every row pi of P
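The rules above translate directly into code. The following Python sketch (with an encoding of our own: "_" for the wildcard, ("|", p, q) for an or-pattern, (name, args) tuples for constructed patterns and values, "Ω" for the undefined value) checks incompatibility by structural recursion:

```python
# Direct transcription of the incompatibility rules (#).
# Encoding is ours: "_" wildcard, ("|", p, q) or-pattern,
# (name, args) constructor, "Ω" undefined value.

OMEGA = "Ω"

def incompat(p, v):
    """p # v: no refinement of v can be an instance of p."""
    if p == "_" or v == OMEGA:      # wildcards match anything; Ω can still refine
        return False
    if p[0] == "|":                 # (p1 | p2) # v  iff  p1 # v and p2 # v
        return incompat(p[1], v) and incompat(p[2], v)
    (cp, ps), (cv, vs) = p, v
    if cp != cv:                    # clashing head constructors
        return True
    return any(incompat(q, w) for q, w in zip(ps, vs))

def matrix_incompat(P, vs):
    """P # v: every row of P is incompatible with v."""
    return all(any(incompat(p, v) for p, v in zip(row, vs)) for row in P)

T, F = ("True", ()), ("False", ())
print(incompat(T, F))               # True: True # False
print(incompat(T, OMEGA))           # False: Ω may still become True
print(matrix_incompat([[T]], [F]))  # True
```

Note that the wildcard and Ω cases come first: a wildcard is never incompatible, and Ω is compatible with every pattern, as required by monotonicity.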

It is routine to show that incompatibility meets key properties. Hence, by Proposition 3, algorithm Urec is correct with respect to Laville’s semantics.

Laville’s definition is quite appealing as a good, implementation-independent definition of lazy pattern matching. However, there is a slight difficulty: for some matrices P, the predicate P # v is not sequential in v, in the sense of Kahn and Plotkin [1978]. This means that its compilation on an ordinary, sequential computer is problematic (Maranget [1992], Sekar et al. [1992]). As a consequence, the Haskell committee adopted another semantics for pattern matching. Their definition is aware of the presence of Ω and avoids the difficulty by specifying a left-to-right testing order.

### 4.3  Pattern matching in Haskell

By interpreting the Haskell report Hudak et al. [1998] we can formulate a pattern matching predicate for this language. Matching can yield three different results: it may either succeed, fail or diverge. Furthermore, matching of arguments is performed left-to-right. We encode “success”, “failure” and “divergence” by the three values T, F and ⊥, and define the following H function.

 H(_, v) = T
 H(c(p1, … , pa), Ω) = ⊥
 H(c(p1, … , pa), c′(v1, … , va′)) = F   (where c ≠ c′)
 H(c(p1, … , pa), c(v1, … , va)) = H((p1 ⋯ pa), (v1 ⋯ va))
 H((p1 p2 ⋯ pn), (v1 v2 ⋯ vn)) = H(p1, v1) ∧⊥ H((p2 ⋯ pn), (v2 ⋯ vn))
 H((), ()) = T

where the extended (left-to-right) boolean connectives are defined as follows.

 T ∧⊥ x = x        T ∨⊥ x = T
 F ∧⊥ x = F        F ∨⊥ x = x
 ⊥ ∧⊥ x = ⊥        ⊥ ∨⊥ x = ⊥

We ignore some Haskell patterns, such as irrefutable patterns. We also ignore or-patterns for the moment. From this definition one easily shows the following two properties on vectors:

 H(p, v) = T  ⇐⇒  p ≼ v
 H(p, v) = F  =⇒  p # v  =⇒  p ⋠ v.

We then interpret the many program equivalences of section 3.17.3 of the Haskell report as expressing a downward search for a pattern of which v is an instance:

 H(∅, v) = F
 H(P, v) = H(p1, v) ∨⊥ H(P[2…m], v).

Informally, H(P, v ) = T means “v is found to match some row in P in the Haskell way”, H(P, v ) = F means “no row of P is found to be matched”, and H(P, v ) = ⊥ means “v is not precise enough to make a clear decision”. We can now formulate the Haskell way of pattern matching in our setting.
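As a sanity check of this reading, the downward search over rows can be sketched as follows, again in our illustrative Python encoding (self-contained; "T"/"F"/"B" for T/F/⊥, (name, args) tuples for values, nullary-friendly):

```python
# Matrix-level H: rows are combined with ∨⊥, columns with ∧⊥.
# Self-contained sketch; encoding ("T"/"F"/"B", (name, args), "Ω") is ours.

OMEGA, BOT = "Ω", "B"

def H(p, v):
    if p == "_": return "T"
    if v == OMEGA: return BOT
    if p[0] != v[0]: return "F"
    return H_row(p[1], v[1])

def H_row(ps, vs):
    for p, v in zip(ps, vs):
        r = H(p, v)
        if r != "T": return r       # F ∧⊥ x = F,  ⊥ ∧⊥ x = ⊥
    return "T"

def H_matrix(P, vs):
    for row in P:
        r = H_row(row, vs)
        if r != "F": return r       # T ∨⊥ x = T,  ⊥ ∨⊥ x = ⊥
    return "F"                      # H(∅, v) = F

T, F = ("True", ()), ("False", ())
# The matching of Example 3: case e of True -> 1 | _ -> 2.
print(H_matrix([[T], ["_"]], [OMEGA]))  # "B": e's head constructor is needed
print(H_matrix([[T], ["_"]], [F]))      # "T": the second row is found
```

On the matrix of Example 3, the value Ω yields ⊥ (more evaluation is demanded), while False yields T through the wildcard row.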

Definition 11 (Haskell pattern matching)   Define P(P, v ) to be H(P, v ) = F in the generic Definition 8.

One easily checks that predicate H(P, v ) = F meets all basic restrictions and key properties (decomposing along first columns is instrumental). Hence, save for or-patterns, algorithm Urec also computes the utility of pattern matching in Haskell.

As this work is partly dedicated to specific warnings for or-patterns, we wish to enrich Haskell matching with or-patterns. The H function is extended to consider or-patterns, sticking to left-to-right bias:

 H((p1∣p2), v) = H(p1, v) ∨⊥H(p2, v).

Semantical consequences are non-negligible, since the equivalence H(p, v) = T  ⇐⇒  p ≼ v no longer holds, as can be seen by considering H((True ∣ _), Ω) = ⊥.
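This counter-example is easy to check mechanically. The self-contained sketch below extends our illustrative Python H with the or-pattern clause (encoding ours, as before):

```python
# H extended with or-patterns: H((p1|p2), v) = H(p1, v) ∨⊥ H(p2, v).
# Self-contained sketch; encoding ("T"/"F"/"B", ("|", p, q), etc.) is ours.

OMEGA, BOT = "Ω", "B"

def H(p, v):
    if p == "_": return "T"
    if p[0] == "|":                  # or-pattern, left-to-right
        r = H(p[1], v)
        return H(p[2], v) if r == "F" else r   # T ∨⊥ x = T, ⊥ ∨⊥ x = ⊥
    if v == OMEGA: return BOT
    if p[0] != v[0]: return "F"
    return H_row(p[1], v[1])

def H_row(ps, vs):
    for q, w in zip(ps, vs):
        r = H(q, w)
        if r != "T": return r
    return "T"

T = ("True", ())
# (True | _) instance-matches Ω (since _ ≼ Ω), yet H answers ⊥:
print(H(("|", T, "_"), OMEGA))   # "B"
```

The left alternative True demands evaluation of Ω, so the or-pattern as a whole diverges even though its right alternative would match without any evaluation.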

However, the left-to-right implication still holds, and the following definition of Haskell pattern matching makes sense.

Definition 12 (Haskell matching with or-patterns)   Let P be a pattern matrix and v be a value vector. Vector v matches row i in P, if and only if the following proposition holds:

 H(P[1…i), v) = F  ∧  H(pi, v) = T.
From this definition of matching, we define the utility predicate UH and the set of matching values MH as we did in Definition 6.

The definition above is not the application of the generic Definition 8 to P(P, v) = (H(P, v) = F), because we have written H(pi, v) = T in place of the instance relation pi ≼ v. However, as illustrated by the following lemma, H(q, v) = T and q ≼ v are closely related.

Lemma 2   Let p be a pattern and v be a value such that H(p, v) = ⊥. There exists a value w such that v ≼Ω w and H(p, w) ≠ ⊥. Furthermore, if p ≼ v, then H(p, w) = T.
Proof: We first prove the existence of w by induction on p.
• If p = c(p1, … , pa), then, by hypothesis H(p, v) = ⊥, we have two sub-cases.
  – Value v is c(v1, … , va), with H(pi, vi) = ⊥ for i in some (non-empty) index set I. Applying the induction hypothesis to all such i yields values wi such that vi ≼Ω wi and H(pi, wi) ≠ ⊥. Then, we define w = c(w1, … , wa), where wi is as above for i ∈ I, and wi = vi otherwise.
  – Otherwise, value v is Ω. Let v′ be c(Ω, … , Ω). If H(p, v′) is not ⊥, then we define w = v′. Otherwise, we reason as in the previous case.
• If p = (q1 ∣ q2), we have two sub-cases.
  – If H(q1, v) = F and H(q2, v) = ⊥, then (by induction) there exists a value w such that v ≼Ω w and H(q2, w) ≠ ⊥. Finally, by definition of H, we have H(p, w) = H(q2, w) ≠ ⊥.
  – If H(q1, v) = ⊥, then (by induction) there exists a value w′, such that v ≼Ω w′ and H(q1, w′) ≠ ⊥. If H(q1, w′) = T, we define w to be w′ and we conclude. Otherwise, H(q1, w′) = F, and thus H((q1 ∣ q2), w′) = H(q2, w′). Then we conclude, either directly, or by induction in the case where H(q2, w′) = ⊥.
Additionally, H(p, w) = T holds under the extra hypothesis p ≼ v, by the implication H(p, w) = F =⇒ p ⋠ w and by the monotonicity of ≼. □

The lemma above suffices to relate Haskell matching to generic lazy matching, and thus to compute the utility of Haskell matching.

Proposition 4   We have UH = Urec.
Proof: We write U′H for the utility predicate that results from the generic definition, taking P(P, v) to be H(P, v) = F. From the generic Proposition 3, we have U′H = Urec. (Formally, we check that predicate H(P, v) = F meets the key properties even when some of the patterns in P are or-patterns.)

Then we show UH = U′H. From the implication H(q, v) = T =⇒ q ≼ v, we have UH(P, q) =⇒ U′H(P, q); the converse implication follows from Lemma 2. □

It is time to clearly stress an important consequence of the propositions U = Urec, UP = Urec and UH = Urec: all our utility predicates are in fact equal. This suggests a quite powerful and elegant “semantical” proof technique, which we immediately demonstrate.

Lemma 3 (Irrelevance of column order)   Let P be a pattern matrix and q be a pattern vector. By applying the same column permutation to both P and q we get matrix P′ and vector q′. Then we have UH(P, q) = UH(P′, q′).
Proof: Consider strict matching. Since the predicates P ≼ v and q ≼ v do not depend on column order, we have U(P, q) = U(P′, q′). From UH = U, we conclude. □

First observe that proving the irrelevance of column order for Haskell matching by induction on matrix and pattern structure would be quite cumbersome.

Also notice that the lemma above is not obvious, since Haskell matching depends upon column order in a strong sense. For instance, let P, q , P′ and q ′ be as follows.

 P = ( True  False )    q = (False   _)    P′ = ( False  True )    q′ = (_   False).

Matrix P′ (resp. vector q ′) is P (resp. q ) with columns swapped. The sets of matching values are as follows.

 MH(P, q) = { (False   Ω), (False   True), (False   False) }
 MH(P′, q′) = { (True   False), (False   False) }

Swapping the components of the elements of MH(P, q) does not yield MH(P′, q′), since (Ω   False) does not belong to MH(P′, q′). However, some of the values in the MH sets above are related by the permutation. Moreover, the equality UH = U can be seen as telling us that there is at least one such value.
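The two MH sets above can be recomputed by brute force. The self-contained Python enumeration below (boolean values only, with our usual illustrative encoding of T/F/⊥ and values) computes MH(P, q) as the set of vectors v with H(P, v) = F and H(q, v) = T:

```python
# Enumerating the MH sets of the column-swap example over the
# boolean partial values {Ω, True, False}. Encoding is ours:
# "T"/"F"/"B" for T/F/⊥, (name, ()) for nullary constructors.

OMEGA, BOT = "Ω", "B"
T, F = ("True", ()), ("False", ())

def H(p, v):
    if p == "_": return "T"
    if v == OMEGA: return BOT
    return "T" if p[0] == v[0] else "F"   # nullary constructors only

def H_row(ps, vs):
    for p, v in zip(ps, vs):
        r = H(p, v)
        if r != "T": return r             # left-to-right ∧⊥
    return "T"

def H_matrix(P, vs):
    for row in P:
        r = H_row(row, vs)
        if r != "F": return r             # downward search, ∨⊥
    return "F"

def MH(P, q):
    vals = [OMEGA, T, F]
    return {(v1, v2) for v1 in vals for v2 in vals
            if H_matrix(P, (v1, v2)) == "F" and H_row(q, (v1, v2)) == "T"}

print(MH([[T, F]], [F, "_"]))   # three vectors (False v2), v2 in {Ω, True, False}
print(MH([[F, T]], ["_", F]))   # two vectors: (True False) and (False False)
```

The enumeration confirms in particular that (Ω   False) is absent from MH(P′, q′): with q′ = (_   False), the row (False   True) of P′ cannot answer F before evaluating the first component.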

From now on, we simply write U for any utility predicate, regardless of semantics. We also write “algorithm U” for Urec.