# Relations

- A binary relation from set \(A\) to set \(B\) is a subset of \(A\times B\).
- e.g. if we use \(A=\{1,2,3,4\}\) and \(B=\{a,b,c\}\), we could define a relation \(R=\{(1,a),(1,c),(2,b),(3,b)\}\).
- Then we would say that \((1,c)\) has this relation, of that \(1\) is related to \(c\) by \(R\).
- It is also common to write “\(1 \mathop{R}c\)” to indicate that two elements are related, or “\(2 \mathop{\not R}c\)” to indicate that they aren't.

- More realistic example: “less than” (\(<\)) is a relation on the real numbers.
- The set of values in the relation are \(\{(i,j)\mid i,j\in \mathbf{R}, i<j\}\subseteq \mathbf{R}\times \mathbf{R} \).
- The \(1< 2\) and \(2\not < 1\) notation makes a lot more sense for the \(<\) symbol.
- For consistency, we would also have to allow ourselves to write this, even though it looks insane: \[(\text{<}) = \{(i,j)\mid i,j\in \mathbf{R}, i<j\}\]
- I'd probably rather make up some notation like \(R_{<} = \{\cdots\}\) if necessary.

- We could look a functions as a particular kind of relation.
- A function is a relation where we're only allowed to have one “result”.
- A relation \(R\) is a function if \((a,b_1),(a,b_2)\in R\) implies \(b_1=b_2\).
- For a function \(f:A\rightarrow B\), we could define it as a relation \(f = \{(a,b)\mid a\in A, b=f(a)\}\).

- It's fairly common to use a relation from a set to itself: a subset of \(A\times A\).
- Call this “a relation on the set \(A\)”.
- e.g. less-than is a relation on \(\mathbf{R}\).

- e.g. “divides” is a relation on the integers we defined earlier.
- We defined it as \(a\mid b\) if there is an integer \(c\) such that \(b=ac\).
- Or we could now think of the set \(\{(a,b)\mid a,b\in\mathbf{Z}, b=ac\text{ for an integer }c\}\).

# Properties of Relations

- Just having “a subset of the Cartesian product” isn't very interesting.
- Relations that have some particular properties are common, and can be more useful.
- We're generally concerned about relations on a particular set here: from a set to itself.

- A relation \(R\) on the set \(A\) is reflexive if \((a,a)\in R\) for all \(a\in A\).
- That is, every element is related to itself.
- For example \(\le\) is reflexive, but \(<\) is not.
- The relation “divides” is reflexive since every integer divides itself.
- A relations is irreflexive if \((a,a)\not\in R\) for all \(a\in A\).
- The relation \(<\) is irreflexive.

- A relation is symmetric if \((a,b)\in R\) implies that \((b,a)\in R\).
- A relation is antisymmetric if \((a,b)\in R\) and \((b,a)\in R\) only when \(a=b\).
- For example, \(<\), \(\le\), and divisibility are all antisymmetric.
- The “equals” (\(=\)) relation is symmetric.
- Congruence modulo \(k\) is symmetric. That is, \(a\) and \(b\) are related iff \(a\equiv b\text{ (mod }k\text{)}\) is symmetric.
- Some relations are neither symmetric nor antisymmetric: the relation \(\{(1,1),(1,2)\}\) is neither.
- … it's also neither reflexive nor irreflexive.

- A relation is transitive if whenever \((a,b)\in R\) and \((b,c)\in R\), we then have \((a,c)\in R\).
- \(\le\) and \(<\) are both transitive.
- So is divisibility: if \(a\mid b\) and \(b\mid c\) then there are integers so that \(b=db\) and \(c=eb\) so \(c=(de)a\) and \(a\mid c\).
- A relation that isn't transitive: \(a\mathop{R}b\) iff \(b=a+1\). Then \(2\mathop{R}3\) and \(3\mathop{R}4\) but \(2\mathop{\not R}3\)
- Inequality (\(\ne\)) isn't transitive: \(4\ne 5\) and \(5\ne 4\) but \(4=4\).

- Example: set \(S\) be the set of cities with airports, and \(a\mathop{R}b\) if there is a direct flight from \(a\) to \(b\).
- \(R\) is definitely not reflexive. It's probably irreflexive since no airline would fly from a city and back (without stopping). [Unless you count sightseeing flights or something.]
- \(R\) is symmetric: if there's a flight from Shanghai to Vancouver, then there's a flight from Vancouver to Shanghai. [Right? I don't know of any counterexamples.]
- \(R\) is not transitive: there is a flight from Hangzhou to Beijing and Beijing to Vancouver, but none from Hangzhou to Vancouver.

- If we have a finite set \(A\) with \(|A|=n\), how many relations are there?
- There are \(2^{n^2}\) relations on \(A\) since there are \(n^2\) elements of \(A\times A\), and any subset is a relation.
- How many are reflexive?
- Of the \(n^2\) elements of \(A\times A\), we must take the \(n\) values \((a,a)\) in the subset. The others can be chosen freely.
- So, \(2^{n^2-n}\).

- How many are symmetric?
- If we can decide on an order for the \(n\) elements, we can choose the “smaller” elements first (like \((a,b)\)) and that implies the other order must also be included (like \((b,a)\)).
- So, how many pairs \((a,b)\) can we choose where \(a\le b\)? There are \(n(n+1)/2\). We choose from those, then since the relation is symmetric, we know we must include/exclude \((b,a)\).
- So, \(2^{n(n+1)/2}\) relations.

- How many are transitive?
- No known formula. (A006905)

# Combining Relations

- Since relations are just subsets of a Cartesian product, we can combine them with regular set operations.
- For example, if we take \(R_{<}\), \(R_{>}\), \(R_{\le}\), \(R_{\ge}\) to be the sets for relations less-than, greater-than, etc., then
- \(R_{\le}\cap R_{\ge}\) is the equality relation, \(R_{=}\).
- \(R_{<}\cup R_{>}\) is the inequality relation, \(R_{\ne}\).
- \(R_{\le}-R_{=}=R_{<}\).

- Or we could use an operation we used on functions and compose relations.
- That is, we we have two relations \(R_1\subseteq A\times B\) and \(R_2\subseteq B\times C\), then we can define \(A\circ B\) with elements \((a,c)\) where there is a \(b\in B\) such that \((a,b)\in R_1\) and \((b,c)\in R_2\).

- For example, the “plus one” relation that we had earlier: \(a\mathop{R}b\) iff \(b=a+1\).
- If we compose it with “greater than or equal” over the integers, \(R\circ R_{\ge}\), we get \(R_{>}\) since \(m\ge n\) iff \(m+1 > n\) for integers \(m,n\).
- If we compose it with itself, \(R\circ R\), we get “plus two”: \(a\mathop{(R\circ R)}b\) iff \(b=a+2\).

- We can also continue to compose a relation with itself.
- We can define \(R^n\) as \(R^1=R\) and \(R^{n+1}=R^n\circ R\).
- So \(R^3=(R\circ R)\circ R\) and \(R^4=((R\circ R)\circ R)\circ R\).

- For the “flights between cities” relation…
- \(R^2\) is the cities you can get to in exactly two flights. So, \((\text{Hangzhou},\text{Vancouver})\in R^2\) (or we could write \(\text{Hangzhou} \mathop{R^2} \text{Vancouver}\), but that's awkward).
- \((\text{Hangzhou},\text{Shanghai})\in R^2\) since \((\text{Hangzhou},\text{Beijing}),(\text{Beijing},\text{Shanghai})\in R^2\). We haven't said anything about
*needing*two flights. - \(R^2-R\) is the relation where you
*must take*two flights (there is no direct route). - \(R\cup R^2 \cup R^3\cup\cdots\) are cities that you can fly between, taking any number of flights.

*Theorem:*A relation on a set is transitive if and only if \(R^n\subseteq R\) for all \(n\ge 1\).*Proof:*First assume that \(R^n\subseteq R\) for all \(n\ge 1\) (to show the “if” part). If we have \((a,b),(b,c)\in R\) then \((a,c)\in R^2\). From the assumption, \((a,c)\in R\), so \(R\) is transitive.Now assume that \(R\) is transitive. For \(n=1\), \(R^1\subseteq R\) by definition. We can show that \(R^n\subseteq R\) for larger \(n\) by induction.

Assume for induction that \(R^n\subseteq R\). If we have an element \((a,b)\in R^{n+1}=R^n\circ R\) then there is a \(c\) such that \((a,c)\in R^{n}\subseteq R\) and \((c,b)\in R\). Since \(R\) is transitive and we have two elements of \(R\), we know that \((a,b)\in R\). Thus \(R^{n+1}\subseteq R\).

By induction, \(R^{n}\subseteq R\) for all \(n\).∎

- The inverse relation of a relation \(R\) is the set \(R^{-1}=\{(b,a)\mid (a,b)\in R\}\).
- For example, the inverse of “less than” is “greater than”, since \(a<b\) iff \(b>a\).

*Theorem:*A relation is symmetric iff \(R=R^{-1}\).*Proof:*First consider a symmetric relation \(R\) and any \((a,b)\in R\). Since \(R\) is symmetric, we know that \((b,a)\in R\). Thus, we have \(R\subseteq R^{-1}\).Now take any \((a,b)\in R^{-1}\). By definition, \((b,a)\in R\) and by symmetry, \((a,b)\in R\). So, \(R^{-1}\subseteq R\) and we have \(R=R^{-1}\).

Finally, suppose \(R=R^{-1}\). For any \((a,b)\in R\), we know \((b,a)\in R^{-1}=R\). So, \(R\) is symmetric.∎

# Relation-Like Things

- In a lot of ways, relations aren't really new concepts.
- They're like a generalization of functions (as mentioned before).
- We just remove the restriction that a value must map onto a unique image.

- Relations are a lot like predicates with two arguments.
- We could have defined a predicate \(P(a,b)\) to be true iff \(a<b\).
- Or we could define a relation that contains all \((a,b)\) where \(a<b\).
- The predicates could be combined with logical operators (\(\vee,\wedge,\ldots\)), and the relations with set operators (\(\cup,\cap,\ldots\)).
- The results are fundamentally the same, but one or the other might be easier notation to work with.
- In particular, things like “less than” are usually thought of as relations, so there are plenty of theorems you'll find using that notation.

- One more way to think of these things: as Boolean functions.
- That is, functions in a programming language that return true or false.
- e.g. a Java method:
`public boolean lessthan(int a, int b) { return a<b; }`

- e.g. in Haskell you can actually define a new operator that works like a relation:
`(<) :: Int -> Int -> Bool (<) a b = not (a>b) 5 < 6 -- uses the above code to get a result`

# Matrices for Relations

- If we have a relation on a (relatively small) finite set, there are a few ways we can represent them that might be helpful for visualizing them or reasoning about them.
- Suppose we have a relation \(R\) from \(A=\{a_1,a_2,\ldots,a_m\}\) to \(B=\{b_1,b_2,\ldots,b_n\}\).
- We can represent this as an \(m\times n\) matrix, which we'll usually refer to as \(\mathbf{M}_R\).
- The \(i,j\) entry of the matrix will be 1 if \((a_i,b_j)\in R\) and 0 otherwise.
- This gives us an efficient way to represent arbitrary relations in a computer: we need \(mn\) bits to store everything we need.

- For example, if we have \(A=\{a,b,c\}\) and \(B=\{x,y\}\) and a relation \(R=\{(a,x),(a,y),(b,x),(c,y)\}\), then we will represent it with \[\mathbf{M}_R=\begin{bmatrix}1 & 1\\1 & 0\\0 & 1\end{bmatrix}\]
- Note that this requires an order for the elements of the sets.
- The order doesn't matter, as long as we pick one and stay with it.

- For a relation on a set \(A=\{a_1,a_2,\ldots,a_n\}\), we will have an \(n\times n\) matrix.
- A reflexive relation will have 1s down the diagonal. For example the “equals” relation on the set \(A=\{a,b,c\}\) is \(R=\{(a,a),(b,b),(c,c)\}\) and has the matrix \[\mathbf{M}_R=\begin{bmatrix}1 & 0& 0\\0 & 1& 0\\0 & 0& 1\end{bmatrix}\]
- A symmetric relation will have a 1 in position \(i,j\) iff there is a 1 in \(j,i\).
- So, it is a mirror image across the diagonal.
- In other words, it will be its own transpose: \(\mathbf{M}_R=(\mathbf{M}_R)^t\).

- We can calculate the composition of relations from their matrix as well.
- Recall that \(R\circ S\) has elements \((a,c)\) where there is a \(b\) such that \((a,b)\in R\) and \((b,c)\in S\).
- So, element \((i,j)\) of the matrix \(\mathbf{M}_{R\circ S}\) will be one when there is a \(k\) where \((i,k)\) and \((k,j)\) are both one.
- That is, when \((a_{i1}\wedge a_{1j})\vee (a_{i2}\wedge a_{2j})\vee\cdots \vee (a_{in}\wedge a_{nj})\).
- The textbook uses the notation \(\mathbf{M}_{R\circ S}=\mathbf{M}_{R}\odot \mathbf{M}_{S}\) for this operation.

- For example, suppose we have a relation \(R\) on \(\{a,b,c,d\}\) with this matrix
\[\mathbf{M}_R=\begin{bmatrix}1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{bmatrix}\]
- We can tell that the relation isn't reflexive: the \(2,2\) entry is missing.
- It is symmetric: transposing it would not change it.
- The relation \(R^2\) has the matrix \[\mathbf{M}_{R^2}=\begin{bmatrix}1&1&1&0 \\ 1&1&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{bmatrix}\]
- That is also the same as \(\mathbf{M}_{R^3}\), \(\mathbf{M}_{R^4}\), …

# Digraphs for Relations

- A directed graph or digraph is a set \(V\) of vertices (or nodes, singular is “vertex”) and \(E\) of edges that connect vertices.
- If we have vertices \(a\) and \(b\), then an edge that connects \(a\) to \(b\) is \((a,b)\).
- The edges have a direction: \((a,b)\ne(b,a)\).
- That's why it's a
**directed**graph.

- We usually draw digraphs like this:
- Each dot is a vertex.
- Each arrow is an edge.

- A digraph can also represent a relation.
- Vertices represent the elements of the set(s).
- Edges represent the elements of the relation.
- The above digraph is the same relation as the matrix example: \[\mathbf{M}_R=\left[\begin{smallmatrix}1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{smallmatrix}\right]\]

- From the digraph, we can see a few things easily:
- If a relation is reflexive, there will be a loop on each vertex.
- If it's irreflexive, there will be none.
- If it is symmetric, each arrow will be accompanied by another in the opposite direction (like between \(b\) and \(c\) above).
- It's antisymmetric if that never happens.
- If it is transitive, every pair of adjoining edges will also have another that does the same thing in one step, like this:
- The inverse of a relation can be found by just reversing each arrow.

- For a relation that's not on a single set (is a subset of \(A\times B\) with \(A\ne B\)), we can still draw the graph:
- That's probably not as interesting as for relations in \(A\times A\).

# n-ary Relations

- We introduced out subsets of \(A\times B\) as “binary relations“. That should be a hint that there's some other kind of relations.
- For sets \(A_1,A_2,\ldots,A_n\), an
*n*-ary relation on those sets is a subset of \(A_1\times A_2\times \cdots\times A_n\).- For example, we could have a relation on \(\mathbf{Z}\times\mathbf{Z}\times\mathbf{Z}^{+}\) consisting of triples \((a,b,m)\) where \(a\equiv b\ (\text{mod } m)\).
- Then \((3,18,5)\) and \((4,-5,9)\) are in this relation, but \((1,2,5)\) and \((9,100,10)\) are not.

- We could also think of rows in a database as forming a relation.
- For example, a database table might have the following data in each row: student number, name, email address.
- Then the table basically defines a relation on \(\text{integer}\times \text{strings}\times \text{strings}\).

- For an \(n\)-ary relation \(R\), let \(C\) be some condition on the elements of \(R\). A selection operation maps \(R\) to the subset of \(R\) that satisfy the condition.
- For example, we could select form the above relation all people who have an “@zju.edu.cn” email address.
- Or select all entries with student number 3110012345.

- A projection \(P_{i_1,i_2,\ldots,i_m}\) maps the element \((a_1,a_2,\ldots,a_n)\in R\) to \((a_{i_1}, a_{i_2},\ldots,a_{1_m})\).
- That is, it selects only some of the original sets from the relation.
- If we only care about student number and email, records like (3110012345, "Student Name", "student@example.com") would be projected to (3110012345, "student@example.com").
- That is the projection \(P_{1,3}\) on the relation.
- In database terminology, we select only certain columns.

- A join on two relations is an operation that combines values based on similar data in the two relations.
- For example, suppose we have another relation on student number, course, and grade.
- We could join the student number, name, email address relation to this on the student number.
- The result would be a list of all possible combinations where the student numbers match in both relations.
- We would get a relation with values for student number, name, email address, course, and grade.

- Querying and manipulating databases of such relations is a problem for another course.
- Several courses, really.
- We would like to be able to ask questions like “which elements of the relation have some particular property?” and get the answers quickly (\(O(\log n)\) time or better).
- SQL (Structured Query Language) gives us syntax to perform the operations described above.
- … as well as to manipulate the data (insert, update, delete) in the relation.