Relations
- A binary relation from set A to set B is a subset of A\times B.
- e.g. if we use A=\{1,2,3,4\} and B=\{a,b,c\}, we could define a relation R=\{(1,a),(1,c),(2,b),(3,b)\}.
- Then we would say that (1,c) has this relation, of that 1 is related to c by R.
- It is also common to write “1 \mathop{R}c” to indicate that two elements are related, or “2 \mathop{\not R}c” to indicate that they aren't.
- More realistic example: “less than” (<) is a relation on the real numbers.
- The set of values in the relation are \{(i,j)\mid i,j\in \mathbf{R}, i<j\}\subseteq \mathbf{R}\times \mathbf{R} .
- The 1< 2 and 2\not < 1 notation makes a lot more sense for the < symbol.
- For consistency, we would also have to allow ourselves to write this, even though it looks insane: (\text{<}) = \{(i,j)\mid i,j\in \mathbf{R}, i<j\}
- I'd probably rather make up some notation like R_{<} = \{\cdots\} if necessary.
- We could look a functions as a particular kind of relation.
- A function is a relation where we're only allowed to have one “result”.
- A relation R is a function if (a,b_1),(a,b_2)\in R implies b_1=b_2.
- For a function f:A\rightarrow B, we could define it as a relation f = \{(a,b)\mid a\in A, b=f(a)\}.
- It's fairly common to use a relation from a set to itself: a subset of A\times A.
- Call this “a relation on the set A”.
- e.g. less-than is a relation on \mathbf{R}.
- e.g. “divides” is a relation on the integers we defined earlier.
- We defined it as a\mid b if there is an integer c such that b=ac.
- Or we could now think of the set \{(a,b)\mid a,b\in\mathbf{Z}, b=ac\text{ for an integer }c\}.
Properties of Relations
- Just having “a subset of the Cartesian product” isn't very interesting.
- Relations that have some particular properties are common, and can be more useful.
- We're generally concerned about relations on a particular set here: from a set to itself.
- A relation R on the set A is reflexive if (a,a)\in R for all a\in A.
- That is, every element is related to itself.
- For example \le is reflexive, but < is not.
- The relation “divides” is reflexive since every integer divides itself.
- A relations is irreflexive if (a,a)\not\in R for all a\in A.
- The relation < is irreflexive.
- A relation is symmetric if (a,b)\in R implies that (b,a)\in R.
- A relation is antisymmetric if (a,b)\in R and (b,a)\in R only when a=b.
- For example, <, \le, and divisibility are all antisymmetric.
- The “equals” (=) relation is symmetric.
- Congruence modulo k is symmetric. That is, a and b are related iff a\equiv b\text{ (mod }k\text{)} is symmetric.
- Some relations are neither symmetric nor antisymmetric: the relation \{(1,1),(1,2)\} is neither.
- … it's also neither reflexive nor irreflexive.
- A relation is transitive if whenever (a,b)\in R and (b,c)\in R, we then have (a,c)\in R.
- \le and < are both transitive.
- So is divisibility: if a\mid b and b\mid c then there are integers so that b=db and c=eb so c=(de)a and a\mid c.
- A relation that isn't transitive: a\mathop{R}b iff b=a+1. Then 2\mathop{R}3 and 3\mathop{R}4 but 2\mathop{\not R}3
- Inequality (\ne) isn't transitive: 4\ne 5 and 5\ne 4 but 4=4.
- Example: set S be the set of cities with airports, and a\mathop{R}b if there is a direct flight from a to b.
- R is definitely not reflexive. It's probably irreflexive since no airline would fly from a city and back (without stopping). [Unless you count sightseeing flights or something.]
- R is symmetric: if there's a flight from Shanghai to Vancouver, then there's a flight from Vancouver to Shanghai. [Right? I don't know of any counterexamples.]
- R is not transitive: there is a flight from Hangzhou to Beijing and Beijing to Vancouver, but none from Hangzhou to Vancouver.
- If we have a finite set A with |A|=n, how many relations are there?
- There are 2^{n^2} relations on A since there are n^2 elements of A\times A, and any subset is a relation.
- How many are reflexive?
- Of the n^2 elements of A\times A, we must take the n values (a,a) in the subset. The others can be chosen freely.
- So, 2^{n^2-n}.
- How many are symmetric?
- If we can decide on an order for the n elements, we can choose the “smaller” elements first (like (a,b)) and that implies the other order must also be included (like (b,a)).
- So, how many pairs (a,b) can we choose where a\le b? There are n(n+1)/2. We choose from those, then since the relation is symmetric, we know we must include/exclude (b,a).
- So, 2^{n(n+1)/2} relations.
- How many are transitive?
- No known formula. (A006905)
Combining Relations
- Since relations are just subsets of a Cartesian product, we can combine them with regular set operations.
- For example, if we take R_{<}, R_{>}, R_{\le}, R_{\ge} to be the sets for relations less-than, greater-than, etc., then
- R_{\le}\cap R_{\ge} is the equality relation, R_{=}.
- R_{<}\cup R_{>} is the inequality relation, R_{\ne}.
- R_{\le}-R_{=}=R_{<}.
- Or we could use an operation we used on functions and compose relations.
- That is, we we have two relations R_1\subseteq A\times B and R_2\subseteq B\times C, then we can define A\circ B with elements (a,c) where there is a b\in B such that (a,b)\in R_1 and (b,c)\in R_2.
- For example, the “plus one” relation that we had earlier: a\mathop{R}b iff b=a+1.
- If we compose it with “greater than or equal” over the integers, R\circ R_{\ge}, we get R_{>} since m\ge n iff m+1 > n for integers m,n.
- If we compose it with itself, R\circ R, we get “plus two”: a\mathop{(R\circ R)}b iff b=a+2.
- We can also continue to compose a relation with itself.
- We can define R^n as R^1=R and R^{n+1}=R^n\circ R.
- So R^3=(R\circ R)\circ R and R^4=((R\circ R)\circ R)\circ R.
- For the “flights between cities” relation…
- R^2 is the cities you can get to in exactly two flights. So, (\text{Hangzhou},\text{Vancouver})\in R^2 (or we could write \text{Hangzhou} \mathop{R^2} \text{Vancouver}, but that's awkward).
- (\text{Hangzhou},\text{Shanghai})\in R^2 since (\text{Hangzhou},\text{Beijing}),(\text{Beijing},\text{Shanghai})\in R^2. We haven't said anything about needing two flights.
- R^2-R is the relation where you must take two flights (there is no direct route).
- R\cup R^2 \cup R^3\cup\cdots are cities that you can fly between, taking any number of flights.
Theorem: A relation on a set is transitive if and only if R^n\subseteq R for all n\ge 1.
Proof: First assume that R^n\subseteq R for all n\ge 1 (to show the “if” part). If we have (a,b),(b,c)\in R then (a,c)\in R^2. From the assumption, (a,c)\in R, so R is transitive.
Now assume that R is transitive. For n=1, R^1\subseteq R by definition. We can show that R^n\subseteq R for larger n by induction.
Assume for induction that R^n\subseteq R. If we have an element (a,b)\in R^{n+1}=R^n\circ R then there is a c such that (a,c)\in R^{n}\subseteq R and (c,b)\in R. Since R is transitive and we have two elements of R, we know that (a,b)\in R. Thus R^{n+1}\subseteq R.
By induction, R^{n}\subseteq R for all n.∎
- The inverse relation of a relation R is the set R^{-1}=\{(b,a)\mid (a,b)\in R\}.
- For example, the inverse of “less than” is “greater than”, since a<b iff b>a.
Theorem: A relation is symmetric iff R=R^{-1}.
Proof: First consider a symmetric relation R and any (a,b)\in R. Since R is symmetric, we know that (b,a)\in R. Thus, we have R\subseteq R^{-1}.
Now take any (a,b)\in R^{-1}. By definition, (b,a)\in R and by symmetry, (a,b)\in R. So, R^{-1}\subseteq R and we have R=R^{-1}.
Finally, suppose R=R^{-1}. For any (a,b)\in R, we know (b,a)\in R^{-1}=R. So, R is symmetric.∎
Relation-Like Things
- In a lot of ways, relations aren't really new concepts.
- They're like a generalization of functions (as mentioned before).
- We just remove the restriction that a value must map onto a unique image.
- Relations are a lot like predicates with two arguments.
- We could have defined a predicate P(a,b) to be true iff a<b.
- Or we could define a relation that contains all (a,b) where a<b.
- The predicates could be combined with logical operators (\vee,\wedge,\ldots), and the relations with set operators (\cup,\cap,\ldots).
- The results are fundamentally the same, but one or the other might be easier notation to work with.
- In particular, things like “less than” are usually thought of as relations, so there are plenty of theorems you'll find using that notation.
- One more way to think of these things: as Boolean functions.
- That is, functions in a programming language that return true or false.
- e.g. a Java method:
public boolean lessthan(int a, int b) { return a<b; }
- e.g. in Haskell you can actually define a new operator that works like a relation:
(<) :: Int -> Int -> Bool (<) a b = not (a>b) 5 < 6 -- uses the above code to get a result
Matrices for Relations
- If we have a relation on a (relatively small) finite set, there are a few ways we can represent them that might be helpful for visualizing them or reasoning about them.
- Suppose we have a relation R from A=\{a_1,a_2,\ldots,a_m\} to B=\{b_1,b_2,\ldots,b_n\}.
- We can represent this as an m\times n matrix, which we'll usually refer to as \mathbf{M}_R.
- The i,j entry of the matrix will be 1 if (a_i,b_j)\in R and 0 otherwise.
- This gives us an efficient way to represent arbitrary relations in a computer: we need mn bits to store everything we need.
- For example, if we have A=\{a,b,c\} and B=\{x,y\} and a relation R=\{(a,x),(a,y),(b,x),(c,y)\}, then we will represent it with \mathbf{M}_R=\begin{bmatrix}1 & 1\\1 & 0\\0 & 1\end{bmatrix}
- Note that this requires an order for the elements of the sets.
- The order doesn't matter, as long as we pick one and stay with it.
- For a relation on a set A=\{a_1,a_2,\ldots,a_n\}, we will have an n\times n matrix.
- A reflexive relation will have 1s down the diagonal. For example the “equals” relation on the set A=\{a,b,c\} is R=\{(a,a),(b,b),(c,c)\} and has the matrix \mathbf{M}_R=\begin{bmatrix}1 & 0& 0\\0 & 1& 0\\0 & 0& 1\end{bmatrix}
- A symmetric relation will have a 1 in position i,j iff there is a 1 in j,i.
- So, it is a mirror image across the diagonal.
- In other words, it will be its own transpose: \mathbf{M}_R=(\mathbf{M}_R)^t.
- We can calculate the composition of relations from their matrix as well.
- Recall that R\circ S has elements (a,c) where there is a b such that (a,b)\in R and (b,c)\in S.
- So, element (i,j) of the matrix \mathbf{M}_{R\circ S} will be one when there is a k where (i,k) and (k,j) are both one.
- That is, when (a_{i1}\wedge a_{1j})\vee (a_{i2}\wedge a_{2j})\vee\cdots \vee (a_{in}\wedge a_{nj}).
- The textbook uses the notation \mathbf{M}_{R\circ S}=\mathbf{M}_{R}\odot \mathbf{M}_{S} for this operation.
- For example, suppose we have a relation R on \{a,b,c,d\} with this matrix
\mathbf{M}_R=\begin{bmatrix}1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{bmatrix}
- We can tell that the relation isn't reflexive: the 2,2 entry is missing.
- It is symmetric: transposing it would not change it.
- The relation R^2 has the matrix \mathbf{M}_{R^2}=\begin{bmatrix}1&1&1&0 \\ 1&1&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{bmatrix}
- That is also the same as \mathbf{M}_{R^3}, \mathbf{M}_{R^4}, …
Digraphs for Relations
- A directed graph or digraph is a set V of vertices (or nodes, singular is “vertex”) and E of edges that connect vertices.
- If we have vertices a and b, then an edge that connects a to b is (a,b).
- The edges have a direction: (a,b)\ne(b,a).
- That's why it's a directed graph.
- We usually draw digraphs like this:
- Each dot is a vertex.
- Each arrow is an edge.
- A digraph can also represent a relation.
- Vertices represent the elements of the set(s).
- Edges represent the elements of the relation.
- The above digraph is the same relation as the matrix example: \mathbf{M}_R=\left[\begin{smallmatrix}1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{smallmatrix}\right]
- From the digraph, we can see a few things easily:
- If a relation is reflexive, there will be a loop on each vertex.
- If it's irreflexive, there will be none.
- If it is symmetric, each arrow will be accompanied by another in the opposite direction (like between b and c above).
- It's antisymmetric if that never happens.
- If it is transitive, every pair of adjoining edges will also have another that does the same thing in one step, like this:
- The inverse of a relation can be found by just reversing each arrow.
- For a relation that's not on a single set (is a subset of A\times B with A\ne B), we can still draw the graph:
- That's probably not as interesting as for relations in A\times A.
n-ary Relations
- We introduced out subsets of A\times B as “binary relations“. That should be a hint that there's some other kind of relations.
- For sets A_1,A_2,\ldots,A_n, an n-ary relation on those sets is a subset of A_1\times A_2\times \cdots\times A_n.
- For example, we could have a relation on \mathbf{Z}\times\mathbf{Z}\times\mathbf{Z}^{+} consisting of triples (a,b,m) where a\equiv b\ (\text{mod } m).
- Then (3,18,5) and (4,-5,9) are in this relation, but (1,2,5) and (9,100,10) are not.
- We could also think of rows in a database as forming a relation.
- For example, a database table might have the following data in each row: student number, name, email address.
- Then the table basically defines a relation on \text{integer}\times \text{strings}\times \text{strings}.
- For an n-ary relation R, let C be some condition on the elements of R. A selection operation maps R to the subset of R that satisfy the condition.
- For example, we could select form the above relation all people who have an “@zju.edu.cn” email address.
- Or select all entries with student number 3110012345.
- A projection P_{i_1,i_2,\ldots,i_m} maps the element (a_1,a_2,\ldots,a_n)\in R to (a_{i_1}, a_{i_2},\ldots,a_{1_m}).
- That is, it selects only some of the original sets from the relation.
- If we only care about student number and email, records like (3110012345, "Student Name", "student@example.com") would be projected to (3110012345, "student@example.com").
- That is the projection P_{1,3} on the relation.
- In database terminology, we select only certain columns.
- A join on two relations is an operation that combines values based on similar data in the two relations.
- For example, suppose we have another relation on student number, course, and grade.
- We could join the student number, name, email address relation to this on the student number.
- The result would be a list of all possible combinations where the student numbers match in both relations.
- We would get a relation with values for student number, name, email address, course, and grade.
- Querying and manipulating databases of such relations is a problem for another course.
- Several courses, really.
- We would like to be able to ask questions like “which elements of the relation have some particular property?” and get the answers quickly (O(\log n) time or better).
- SQL (Structured Query Language) gives us syntax to perform the operations described above.
- … as well as to manipulate the data (insert, update, delete) in the relation.