Closures
- We have considered the reflexive, symmetric, and transitive properties of relations.
- We know that some relations have these properties and some don't.
- The question here is: how can we turn a relation into one that does have these properties?
- That is, given a relation R, what is the least we can add to it to make it reflexive, symmetric, or transitive?
- … or maybe so it has some other property?
- More formally, for a property \mathbf{P} of relations, the closure of R for property \mathbf{P} is a relation R_c with…
- R_c has property \mathbf{P}.
- R\subseteq R_c.
- For every relation S with R\subseteq S, if S has property \mathbf{P} then R_c\subseteq S.
- Note: not every relation and property has a closure, but we can find them for the ones we're interested in.
Reflexive Closure
- To make a relation reflexive, all we need to do is add the “self” pairs that would make it reflexive.
- For a relation on a set A, we will use \Delta to denote the set \{(a,a)\mid a\in A\}.
Theorem: The reflexive closure of a relation R is R\cup \Delta.
Proof Idea: Any reflexive relation containing R must contain R\cup \Delta. If not, it would either not be a superset of R or not be reflexive.
- For example, the reflexive closure of the < relation is:
\{(a,b)\mid a<b\}\cup \{(a,b)\mid a=b\} = \{(a,b)\mid a\le b\}\,.
- That is, the reflexive closure of < is \le.
Corollary: If R is a reflexive relation, the reflexive closure of R is R.
Proof: If R is reflexive, then \Delta\subseteq R. Thus, R\cup \Delta= R.∎
- For example, \le is its own reflexive closure.
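- As a concrete sketch (my own Python, not from the notes), the construction is a one-liner over a relation stored as a set of pairs:

    def reflexive_closure(R, A):
        """Reflexive closure of a relation R (set of pairs) on set A: R ∪ Δ."""
        delta = {(a, a) for a in A}    # Δ = {(a,a) | a ∈ A}
        return R | delta

    # Sanity check: the reflexive closure of < on {1,2,3} is ≤ on {1,2,3}.
    A = {1, 2, 3}
    less = {(a, b) for a in A for b in A if a < b}
    assert reflexive_closure(less, A) == {(a, b) for a in A for b in A if a <= b}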
Symmetric Closure
- It's also fairly obvious how to make a relation symmetric: if (a,b) is in R, we have to make sure (b,a) is there as well.
- We already have a way to express all of the pairs in that form: R^{-1}.
Theorem: The symmetric closure of a relation R is R\cup R^{-1}.
Proof Idea: Any symmetric relation containing R must contain R\cup R^{-1}. If not, it would either not be a superset of R or not be symmetric.
- For example, the symmetric closure of < is:
\{(a,b)\mid a<b\} \cup \{(a,b)\mid a>b\} =\{(a,b)\mid a\ne b\}\,.
- So, the symmetric closure of < is \ne.
Corollary: If R is a symmetric relation, the symmetric closure of R is R.
Proof: If R is symmetric, then by an earlier theorem, R=R^{-1}. Thus, R\cup R^{-1}= R.∎
- For example, = is its own symmetric closure.
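- The same kind of sanity check works here (again a sketch of my own): R\cup R^{-1} over a set of pairs.

    def symmetric_closure(R):
        """Symmetric closure of a relation R (set of pairs): R ∪ R⁻¹."""
        R_inverse = {(b, a) for (a, b) in R}
        return R | R_inverse

    # The symmetric closure of < on {1,2,3} is ≠ on {1,2,3}.
    A = {1, 2, 3}
    less = {(a, b) for a in A for b in A if a < b}
    assert symmetric_closure(less) == {(a, b) for a in A for b in A if a != b}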
Transitive Closure
- Based on the above, your first thought about how to make the transitive closure might be: if both (a,b) and (b,c) are in the relation, then add (a,c) to make it transitive.
- But that doesn't work. Consider the relation with this graph:
- If we add all of the edges as described, we get this:
- That's still not transitive: we still need (a,d) in the relation to make it transitive.
- We're going to have to be a little more clever…
- If we have a directed graph G, then a path in that graph is a sequence of edges (x_0,x_1),(x_1,x_2),\ldots,(x_{n-1},x_n).
- We will say that this is a path of length n from x_0 to x_n.
- If x_0=x_n, then we will call it a cycle or circuit.
- In the first example graph above, there is one path from a to d.
- In the second, there are three: a,b,c,d and a,b,d and a,c,d.
- Note that we haven't said anything about the x_k being unique in our paths.
- Paths are allowed to cross themselves, or even use the same edges multiple times.
- For example, in this graph, to get from a to d, there are paths of length 3, 5, 7, 9, 11,…
Theorem: Let R be a relation on A. There is a path from a to b of length n in the graph of R iff (a,b)\in R^n.
Proof: For n=1, the path must be a single edge. Such a path will exist if and only if (a,b)\in R.
Now assume for induction that the theorem is true for paths of length n with n\ge 1. If there is a path of length n+1 from a to b, then let c be the vertex just before b in that path, so (c,b) is the last edge, and the remainder of the path is a path of length n from a to c. Thus (a,c)\in R^n and by definition of relation composition, (a,b)\in R^{n+1}.
If (a,b)\in R^{n+1}, then there is a c such that (a,c)\in R^{n} and (c,b)\in R. From the inductive hypothesis, there is a path of length n from a to c and adding (c,b) gives a path of length n+1 from a to b.
So, we have a path of length n+1 iff (a,b)\in R^{n+1}, and by induction for all n\ge 1, we have a path of length n iff (a,b)\in R^n.∎
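- To see the theorem in action, here is a small Python sketch (mine, with made-up names) that composes relations stored as sets of pairs and computes R^n:

    def compose(S, R):
        """Composition S ∘ R: all (a,c) with (a,b) ∈ R and (b,c) ∈ S for some b."""
        return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

    def power(R, n):
        """R^n for n ≥ 1, by repeated composition."""
        result = R
        for _ in range(n - 1):
            result = compose(result, R)
        return result

    # In the chain a→b→c→d, the only path from a to d has length 3,
    # so (a,d) shows up in R^3 and nowhere earlier:
    R = {("a", "b"), ("b", "c"), ("c", "d")}
    assert ("a", "d") in power(R, 3)
    assert ("a", "d") not in power(R, 2)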
- For a relation R, we will define the connectivity relation R^{*} as the relation of pairs (a,b) such that you can get from a to b in R using a path of any length.
- That is, R^{*} = \bigcup_{n=1}^{\infty} R^n.
- In the flights-between-cities example, finding R^2, R^3, etc. told us something useful.
- R^n was the cities you can get between in exactly n flights.
- The above theorem proves that this is really true.
- It's then perfectly reasonable to ask what cities we can get between in any number of flights.
- R^{*} gives us a notation for that (and formal definition of what we're talking about).
- If (a,b)\in R^{*}, then it is possible to take a series of flights to get from a to b (without taking the bus in between or something).
Lemma: For a transitive relation R, all R^n are transitive.
Proof: Consider any (a,b),(b,c)\in R^n. From the above theorem, there is a path of length n from a to b and from b to c in R. Then we have a path of length 2n from a to c in R.
Since R is transitive, we can replace each pair of consecutive edges in this path with a single edge (which must exist since R is transitive). The n=3 case would look like this:
This constructs a path of length n from a to c. Thus, (a,c)\in R^n and R^n is transitive.∎
Theorem: The transitive closure of a relation R is R^{*}.
Proof: In order for R^{*} to be the transitive closure, it must contain R, be transitive, and be a subset of any transitive relation that contains R. By the definition of R^{*}, it contains R.
If there are (a,b),(b,c)\in R^{*}, then there are j and k such that (a,b)\in R^j and (b,c)\in R^k. Then (a,c)\in R^{j+k}\subseteq R^{*}. Thus, R^{*} is transitive.
Now suppose we have a transitive relation S with R\subseteq S. Since S is transitive, S^n is transitive for all n (from the lemma) and then from an earlier theorem, S^n\subseteq S. It then follows that S^{*}=\bigcup_{n=1}^{\infty} S^n\subseteq S.
Since R\subseteq S, we know that any path in R must also be a path in S so R^{*}\subseteq S^{*}. Thus, R^{*}\subseteq S^{*}\subseteq S. So, R^{*} is a subset of any transitive relation that contains R and it meets all of the criteria required to be the transitive closure.∎
Corollary: If R is a transitive relation, the transitive closure of R is R.
Proof: From the earlier theorem, if R is transitive, then R^n\subseteq R for all n. Thus R^{*}=\bigcup_{n=1}^{\infty} R^n\subseteq R. We also obviously have R\subseteq R^{*}, so R=R^{*}.∎
- Example: We had a relation with this matrix earlier:
\mathbf{M}_R=\left[\begin{smallmatrix}1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{smallmatrix}\right]
- We found that the matrix of R^2 was: \mathbf{M}_{R^2}=\left[\begin{smallmatrix}1&1&1&0 \\ 1&1&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{smallmatrix}\right]
- This is also the transitive closure: R\subseteq R^2, and if we continue to compose with R the matrix doesn't change: {R^2}={R^3}={R^4}=\cdots.
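- We can verify this with a short boolean-matrix sketch in Python (my own; matrices as lists of 0/1 rows, with bool_compose playing the role of \odot):

    def bool_compose(A, B):
        """Entry i,j of A ⊙ B is (a_i1 ∧ b_1j) ∨ ... ∨ (a_in ∧ b_nj)."""
        n = len(A)
        return [[int(any(A[i][k] and B[k][j] for k in range(n)))
                 for j in range(n)] for i in range(n)]

    M_R = [[1,0,1,0], [0,0,1,0], [1,1,1,0], [0,0,0,1]]
    M_R2 = bool_compose(M_R, M_R)
    assert M_R2 == [[1,1,1,0], [1,1,1,0], [1,1,1,0], [0,0,0,1]]
    # Composing again changes nothing, so the union R ∪ R^2 ∪ ... is just R^2:
    assert bool_compose(M_R2, M_R) == M_R2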
- Our “plus one” relation over the integers was R=\{(a,a+1)\mid a\in\mathbf{Z}\}.
- It's not transitive, so R^* might be interesting.
- We established that R^2 was the relation containing all pairs (a,a+2).
- Similarly, R^k is the set of (a,a+k) for each integer a.
- The union of all of these is R^{*}=\{(a,b)\mid a<b\}.
- That is, it is the relationship “less than” over the integers.
- That's a little surprising.
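- We can't take an infinite union in code, but a finite slice of \mathbf{Z} makes a decent sanity check (a sketch of my own, reusing the compose helper from the earlier sketch):

    def compose(S, R):    # same helper as in the earlier sketch
        return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

    # "Plus one" restricted to {0,...,9}; union the powers until they die out.
    A = range(10)
    R = {(a, a + 1) for a in A if a + 1 in A}
    R_star, P = set(), R
    while P:                  # on a finite chain, R^k is eventually empty
        R_star |= P
        P = compose(P, R)
    assert R_star == {(a, b) for a in A for b in A if a < b}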
Calculating the Transitive Closure
- The above theorems give us a method to find the transitive closure of a relation.
- Unfortunately, since it's a union of infinitely many things, it's not exactly practical to compute.
- But it turns out that we don't actually need to compute an infinite number of R^n to get the transitive closure (of a finite set).
Theorem: Let R be a relation on A with |A|=n. If there is a path from a to b in R, then there is a path with at most n steps.
Proof: Suppose there is a path from a to b in R. Let m be the length of the shortest such path. Assume for contradiction that m> n.
Since there are only n vertices in the graph of R, by the pigeonhole principle, the path from a to b must pass through at least one of them twice. That is, the path must contain a cycle:
We can form a shorter path from a to b by simply removing the cycle. This contradicts the assumption that the path was the shortest. Thus, m\le n.∎
- This theorem tells us that if we calculate R,R^2,R^3,\ldots,R^n then we have found every connection within R.
- There may be more paths of length >n, but they connect elements that were already connected by some other path.
- Thus, for a relation on n elements, the transitive closure of R is \bigcup_{k=1}^{n} R^k.
- We can finally write an algorithm to compute the transitive closure of a relation that will complete in a finite amount of time.
- Let's assume we're representing our relation as a matrix as described earlier.
- When we use R in the algorithm, we mean \mathbf{M}_R: the n\times n matrix described earlier.
- \odot denotes the operation to compute the composition of two relations.
- Calculating one entry in the matrix for A\odot B requires computing (a_{i1}\wedge b_{1j})\vee (a_{i2}\wedge b_{2j})\vee\cdots \vee (a_{in}\wedge b_{nj}).
- So computing one entry takes \Theta(n) steps, and calculating the whole matrix for a composition takes \Theta(n^3).
- Our algorithm is:
procedure transitive_closure(R):
    C = R
    P = R
    for k from 2 to n:
        P = P ⊙ R    # P = R^k
        C = C ∨ P    # C = R ∪ R^2 ∪ ... ∪ R^k
    return C
- Since we do the \odot operation n times here, this algorithm has running time \Theta(n^4).
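- As runnable Python (a sketch under the same representation: boolean matrices as lists of 0/1 rows; still the \Theta(n^4) procedure above):

    def transitive_closure(M):
        """C = M ∨ M^2 ∨ ... ∨ M^n, where M is the boolean matrix of R."""
        n = len(M)
        def bool_compose(A, B):    # the ⊙ operation
            return [[int(any(A[i][k] and B[k][j] for k in range(n)))
                     for j in range(n)] for i in range(n)]
        C = [row[:] for row in M]
        P = [row[:] for row in M]
        for _ in range(2, n + 1):
            P = bool_compose(P, M)                        # P = R^k
            C = [[C[i][j] | P[i][j] for j in range(n)]    # C = R ∪ ... ∪ R^k
                 for i in range(n)]
        return C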
- That's pretty bad, but it's the best I could do.
- Fortunately, there are people smarter than me.
Warshall's Algorithm
- It turns out that we can do better than \Theta(n^4) to find the transitive closure.
- Consider a relation R on a set A with |A|=n.
- Label the elements of A as a_1,a_2,\ldots,a_n.
- If there is a path from a to b in that relation, then…
- The path will be something like a,i_1,i_2,\ldots,i_k,b, where (a,i_1),(i_1,i_2),\ldots,(i_k,b)\in R.
- We will call the i_j interior vertices of that path.
- That is, things in the path that aren't the endpoints.
- We can now look at a relation and ask if there is a path from a to b using only a particular set of interior vertices.
- In particular, we are interested in paths that use only the first k elements of A as interior vertices.
- There is a path from a to b using only a_1,a_2,\ldots,a_k as interior vertices if either:
- There is a path using only a_1,a_2,\ldots,a_{k-1}, or
- There is a path that uses a_k: a,\ldots,a_k,\ldots,b.
- We can create an algorithm using this.
- Let \mathbf{W}_k be the binary matrix where there is a 1 in entry i,j if you can get from a_i to a_j using only interior vertices a_1,\ldots,a_k.
- We'll call the i,j element of this matrix w_{ij}^{(k)}.
- \mathbf{M}_R (the matrix of the relation) is the matrix of edges in the relation.
- It is the elements you can get between using no interior vertices.
- That is, \mathbf{M}_R=\mathbf{W}_0.
- If we can find \mathbf{W}_n, it is the transitive closure.
- It gives the nodes we can get between using any combination of interior vertices.
- From the above, w_{ij}^{(k)} is one if either:
- w_{ij}^{(k-1)} is one, or
- w_{ik}^{(k-1)} \wedge w_{kj}^{(k-1)} is one.
- Now, it's easy enough to find \mathbf{W}_n. This is the Warshall Algorithm to find the transitive closure:
procedure warshall(R):
    W = R
    for k from 1 to n:
        for i from 1 to n:
            for j from 1 to n:
                w[i,j] = w[i,j] ∨ (w[i,k] ∧ w[k,j])
    return W
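- The same thing as runnable Python (my sketch; note Python is 0-indexed, so index k here plays the role of a_{k+1}):

    def warshall(M):
        """Transitive closure of the relation with boolean matrix M (lists of 0/1 rows)."""
        n = len(M)
        W = [row[:] for row in M]    # W starts out as M_R = W_0
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    # keep w[i,j], or route through vertex k: i → k → j
                    W[i][j] = W[i][j] | (W[i][k] & W[k][j])
        return W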
- At the end of each iteration of the “for k” loop, W is \mathbf{W}_k.
- [Actually, it might be somewhere between \mathbf{W}_k and \mathbf{W}_{k+1}, since we might overwrite other zeros with ones as we process the matrix.]
- The running time here is pretty obvious: \Theta(n^3).
- Our naïve algorithm above was \Theta(n^4).
- A pretty good improvement.
- Let's try it with an example relation: \mathbf{M}_R=\left[\begin{smallmatrix}1&1&0&0 \\0&0&1&0 \\0&0&1&1 \\0&0&0&0 \end{smallmatrix}\right]
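- Running the warshall sketch above on this matrix (rows as Python lists) gives:

    M = [[1,1,0,0], [0,0,1,0], [0,0,1,1], [0,0,0,0]]
    print(warshall(M))
    # [[1, 1, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]

- The new 1s correspond to the paths a_1\to a_2\to a_3, a_1\to a_2\to a_3\to a_4, and a_2\to a_3\to a_4.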
- Aside: with fairly minor modifications, the Warshall Algorithm can find the shortest path, not just decide whether one exists or not.
- Not very useful for transitive closure, but useful as we move on to graph theory.
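- For reference, the shortest-path variant is the Floyd–Warshall algorithm: the same triple loop, with ∨/∧ replaced by min/+ over edge weights. A sketch (mine), with math.inf meaning “no path found yet”:

    import math

    def floyd_warshall(dist):
        """dist[i][j]: weight of edge i→j, or math.inf if absent.
        Returns shortest-path distances via the same interior-vertex idea."""
        n = len(dist)
        D = [row[:] for row in dist]
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    D[i][j] = min(D[i][j], D[i][k] + D[k][j])
        return D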