# Relations

• A binary relation from set $$A$$ to set $$B$$ is a subset of $$A\times B$$.
• e.g. if we use $$A=\{1,2,3,4\}$$ and $$B=\{a,b,c\}$$, we could define a relation $$R=\{(1,a),(1,c),(2,b),(3,b)\}$$.
• Then we would say that $$(1,c)$$ has this relation, of that $$1$$ is related to $$c$$ by $$R$$.
• It is also common to write “$$1 \mathop{R}c$$” to indicate that two elements are related, or “$$2 \mathop{\not R}c$$” to indicate that they aren't.
• More realistic example: “less than” ($$<$$) is a relation on the real numbers.
• The set of values in the relation are $$\{(i,j)\mid i,j\in \mathbf{R}, i<j\}\subseteq \mathbf{R}\times \mathbf{R}$$.
• The $$1< 2$$ and $$2\not < 1$$ notation makes a lot more sense for the $$<$$ symbol.
• For consistency, we would also have to allow ourselves to write this, even though it looks insane: $(\text{<}) = \{(i,j)\mid i,j\in \mathbf{R}, i<j\}$
• I'd probably rather make up some notation like $$R_{<} = \{\cdots\}$$ if necessary.
• We could look a functions as a particular kind of relation.
• A function is a relation where we're only allowed to have one “result”.
• A relation $$R$$ is a function if $$(a,b_1),(a,b_2)\in R$$ implies $$b_1=b_2$$.
• For a function $$f:A\rightarrow B$$, we could define it as a relation $$f = \{(a,b)\mid a\in A, b=f(a)\}$$.
• It's fairly common to use a relation from a set to itself: a subset of $$A\times A$$.
• Call this “a relation on the set $$A$$”.
• e.g. less-than is a relation on $$\mathbf{R}$$.
• e.g. “divides” is a relation on the integers we defined earlier.
• We defined it as $$a\mid b$$ if there is an integer $$c$$ such that $$b=ac$$.
• Or we could now think of the set $$\{(a,b)\mid a,b\in\mathbf{Z}, b=ac\text{ for an integer }c\}$$.

# Properties of Relations

• Just having “a subset of the Cartesian product” isn't very interesting.
• Relations that have some particular properties are common, and can be more useful.
• We're generally concerned about relations on a particular set here: from a set to itself.
• A relation $$R$$ on the set $$A$$ is reflexive if $$(a,a)\in R$$ for all $$a\in A$$.
• That is, every element is related to itself.
• For example $$\le$$ is reflexive, but $$<$$ is not.
• The relation “divides” is reflexive since every integer divides itself.
• A relations is irreflexive if $$(a,a)\not\in R$$ for all $$a\in A$$.
• The relation $$<$$ is irreflexive.
• A relation is symmetric if $$(a,b)\in R$$ implies that $$(b,a)\in R$$.
• A relation is antisymmetric if $$(a,b)\in R$$ and $$(b,a)\in R$$ only when $$a=b$$.
• For example, $$<$$, $$\le$$, and divisibility are all antisymmetric.
• The “equals” ($$=$$) relation is symmetric.
• Congruence modulo $$k$$ is symmetric. That is, $$a$$ and $$b$$ are related iff $$a\equiv b\text{ (mod }k\text{)}$$ is symmetric.
• Some relations are neither symmetric nor antisymmetric: the relation $$\{(1,1),(1,2)\}$$ is neither.
• … it's also neither reflexive nor irreflexive.
• A relation is transitive if whenever $$(a,b)\in R$$ and $$(b,c)\in R$$, we then have $$(a,c)\in R$$.
• $$\le$$ and $$<$$ are both transitive.
• So is divisibility: if $$a\mid b$$ and $$b\mid c$$ then there are integers so that $$b=db$$ and $$c=eb$$ so $$c=(de)a$$ and $$a\mid c$$.
• A relation that isn't transitive: $$a\mathop{R}b$$ iff $$b=a+1$$. Then $$2\mathop{R}3$$ and $$3\mathop{R}4$$ but $$2\mathop{\not R}3$$
• Inequality ($$\ne$$) isn't transitive: $$4\ne 5$$ and $$5\ne 4$$ but $$4=4$$.
• Example: set $$S$$ be the set of cities with airports, and $$a\mathop{R}b$$ if there is a direct flight from $$a$$ to $$b$$.
• $$R$$ is definitely not reflexive. It's probably irreflexive since no airline would fly from a city and back (without stopping). [Unless you count sightseeing flights or something.]
• $$R$$ is symmetric: if there's a flight from Shanghai to Vancouver, then there's a flight from Vancouver to Shanghai. [Right? I don't know of any counterexamples.]
• $$R$$ is not transitive: there is a flight from Hangzhou to Beijing and Beijing to Vancouver, but none from Hangzhou to Vancouver.
• If we have a finite set $$A$$ with $$|A|=n$$, how many relations are there?
• There are $$2^{n^2}$$ relations on $$A$$ since there are $$n^2$$ elements of $$A\times A$$, and any subset is a relation.
• How many are reflexive?
• Of the $$n^2$$ elements of $$A\times A$$, we must take the $$n$$ values $$(a,a)$$ in the subset. The others can be chosen freely.
• So, $$2^{n^2-n}$$.
• How many are symmetric?
• If we can decide on an order for the $$n$$ elements, we can choose the “smaller” elements first (like $$(a,b)$$) and that implies the other order must also be included (like $$(b,a)$$).
• So, how many pairs $$(a,b)$$ can we choose where $$a\le b$$? There are $$n(n+1)/2$$. We choose from those, then since the relation is symmetric, we know we must include/exclude $$(b,a)$$.
• So, $$2^{n(n+1)/2}$$ relations.
• How many are transitive?

# Combining Relations

• Since relations are just subsets of a Cartesian product, we can combine them with regular set operations.
• For example, if we take $$R_{<}$$, $$R_{>}$$, $$R_{\le}$$, $$R_{\ge}$$ to be the sets for relations less-than, greater-than, etc., then
• $$R_{\le}\cap R_{\ge}$$ is the equality relation, $$R_{=}$$.
• $$R_{<}\cup R_{>}$$ is the inequality relation, $$R_{\ne}$$.
• $$R_{\le}-R_{=}=R_{<}$$.
• Or we could use an operation we used on functions and compose relations.
• That is, we we have two relations $$R_1\subseteq A\times B$$ and $$R_2\subseteq B\times C$$, then we can define $$A\circ B$$ with elements $$(a,c)$$ where there is a $$b\in B$$ such that $$(a,b)\in R_1$$ and $$(b,c)\in R_2$$.
• For example, the “plus one” relation that we had earlier: $$a\mathop{R}b$$ iff $$b=a+1$$.
• If we compose it with “greater than or equal” over the integers, $$R\circ R_{\ge}$$, we get $$R_{>}$$ since $$m\ge n$$ iff $$m+1 > n$$ for integers $$m,n$$.
• If we compose it with itself, $$R\circ R$$, we get “plus two”: $$a\mathop{(R\circ R)}b$$ iff $$b=a+2$$.
• We can also continue to compose a relation with itself.
• We can define $$R^n$$ as $$R^1=R$$ and $$R^{n+1}=R^n\circ R$$.
• So $$R^3=(R\circ R)\circ R$$ and $$R^4=((R\circ R)\circ R)\circ R$$.
• For the “flights between cities” relation…
• $$R^2$$ is the cities you can get to in exactly two flights. So, $$(\text{Hangzhou},\text{Vancouver})\in R^2$$ (or we could write $$\text{Hangzhou} \mathop{R^2} \text{Vancouver}$$, but that's awkward).
• $$(\text{Hangzhou},\text{Shanghai})\in R^2$$ since $$(\text{Hangzhou},\text{Beijing}),(\text{Beijing},\text{Shanghai})\in R^2$$. We haven't said anything about needing two flights.
• $$R^2-R$$ is the relation where you must take two flights (there is no direct route).
• $$R\cup R^2 \cup R^3\cup\cdots$$ are cities that you can fly between, taking any number of flights.
• Theorem: A relation on a set is transitive if and only if $$R^n\subseteq R$$ for all $$n\ge 1$$.

Proof: First assume that $$R^n\subseteq R$$ for all $$n\ge 1$$ (to show the “if” part). If we have $$(a,b),(b,c)\in R$$ then $$(a,c)\in R^2$$. From the assumption, $$(a,c)\in R$$, so $$R$$ is transitive.

Now assume that $$R$$ is transitive. For $$n=1$$, $$R^1\subseteq R$$ by definition. We can show that $$R^n\subseteq R$$ for larger $$n$$ by induction.

Assume for induction that $$R^n\subseteq R$$. If we have an element $$(a,b)\in R^{n+1}=R^n\circ R$$ then there is a $$c$$ such that $$(a,c)\in R^{n}\subseteq R$$ and $$(c,b)\in R$$. Since $$R$$ is transitive and we have two elements of $$R$$, we know that $$(a,b)\in R$$. Thus $$R^{n+1}\subseteq R$$.

By induction, $$R^{n}\subseteq R$$ for all $$n$$.

• The inverse relation of a relation $$R$$ is the set $$R^{-1}=\{(b,a)\mid (a,b)\in R\}$$.
• For example, the inverse of “less than” is “greater than”, since $$a<b$$ iff $$b>a$$.
• Theorem: A relation is symmetric iff $$R=R^{-1}$$.

Proof: First consider a symmetric relation $$R$$ and any $$(a,b)\in R$$. Since $$R$$ is symmetric, we know that $$(b,a)\in R$$. Thus, we have $$R\subseteq R^{-1}$$.

Now take any $$(a,b)\in R^{-1}$$. By definition, $$(b,a)\in R$$ and by symmetry, $$(a,b)\in R$$. So, $$R^{-1}\subseteq R$$ and we have $$R=R^{-1}$$.

Finally, suppose $$R=R^{-1}$$. For any $$(a,b)\in R$$, we know $$(b,a)\in R^{-1}=R$$. So, $$R$$ is symmetric.

# Relation-Like Things

• In a lot of ways, relations aren't really new concepts.
• They're like a generalization of functions (as mentioned before).
• We just remove the restriction that a value must map onto a unique image.
• Relations are a lot like predicates with two arguments.
• We could have defined a predicate $$P(a,b)$$ to be true iff $$a<b$$.
• Or we could define a relation that contains all $$(a,b)$$ where $$a<b$$.
• The predicates could be combined with logical operators ($$\vee,\wedge,\ldots$$), and the relations with set operators ($$\cup,\cap,\ldots$$).
• The results are fundamentally the same, but one or the other might be easier notation to work with.
• In particular, things like “less than” are usually thought of as relations, so there are plenty of theorems you'll find using that notation.
• One more way to think of these things: as Boolean functions.
• That is, functions in a programming language that return true or false.
• e.g. a Java method:
public boolean lessthan(int a, int b) {
return a<b;
}
• e.g. in Haskell you can actually define a new operator that works like a relation:
(<) :: Int -> Int -> Bool
(<) a b = not (a>b)
5 < 6                  -- uses the above code to get a result

# Matrices for Relations

• If we have a relation on a (relatively small) finite set, there are a few ways we can represent them that might be helpful for visualizing them or reasoning about them.
• Suppose we have a relation $$R$$ from $$A=\{a_1,a_2,\ldots,a_m\}$$ to $$B=\{b_1,b_2,\ldots,b_n\}$$.
• We can represent this as an $$m\times n$$ matrix, which we'll usually refer to as $$\mathbf{M}_R$$.
• The $$i,j$$ entry of the matrix will be 1 if $$(a_i,b_j)\in R$$ and 0 otherwise.
• This gives us an efficient way to represent arbitrary relations in a computer: we need $$mn$$ bits to store everything we need.
• For example, if we have $$A=\{a,b,c\}$$ and $$B=\{x,y\}$$ and a relation $$R=\{(a,x),(a,y),(b,x),(c,y)\}$$, then we will represent it with $\mathbf{M}_R=\begin{bmatrix}1 & 1\\1 & 0\\0 & 1\end{bmatrix}$
• Note that this requires an order for the elements of the sets.
• The order doesn't matter, as long as we pick one and stay with it.
• For a relation on a set $$A=\{a_1,a_2,\ldots,a_n\}$$, we will have an $$n\times n$$ matrix.
• A reflexive relation will have 1s down the diagonal. For example the “equals” relation on the set $$A=\{a,b,c\}$$ is $$R=\{(a,a),(b,b),(c,c)\}$$ and has the matrix $\mathbf{M}_R=\begin{bmatrix}1 & 0& 0\\0 & 1& 0\\0 & 0& 1\end{bmatrix}$
• A symmetric relation will have a 1 in position $$i,j$$ iff there is a 1 in $$j,i$$.
• So, it is a mirror image across the diagonal.
• In other words, it will be its own transpose: $$\mathbf{M}_R=(\mathbf{M}_R)^t$$.
• We can calculate the composition of relations from their matrix as well.
• Recall that $$R\circ S$$ has elements $$(a,c)$$ where there is a $$b$$ such that $$(a,b)\in R$$ and $$(b,c)\in S$$.
• So, element $$(i,j)$$ of the matrix $$\mathbf{M}_{R\circ S}$$ will be one when there is a $$k$$ where $$(i,k)$$ and $$(k,j)$$ are both one.
• That is, when $$(a_{i1}\wedge a_{1j})\vee (a_{i2}\wedge a_{2j})\vee\cdots \vee (a_{in}\wedge a_{nj})$$.
• The textbook uses the notation $$\mathbf{M}_{R\circ S}=\mathbf{M}_{R}\odot \mathbf{M}_{S}$$ for this operation.
• For example, suppose we have a relation $$R$$ on $$\{a,b,c,d\}$$ with this matrix $\mathbf{M}_R=\begin{bmatrix}1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{bmatrix}$
• We can tell that the relation isn't reflexive: the $$2,2$$ entry is missing.
• It is symmetric: transposing it would not change it.
• The relation $$R^2$$ has the matrix $\mathbf{M}_{R^2}=\begin{bmatrix}1&1&1&0 \\ 1&1&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{bmatrix}$
• That is also the same as $$\mathbf{M}_{R^3}$$, $$\mathbf{M}_{R^4}$$, …

# Digraphs for Relations

• A directed graph or digraph is a set $$V$$ of vertices (or nodes, singular is “vertex”) and $$E$$ of edges that connect vertices.
• If we have vertices $$a$$ and $$b$$, then an edge that connects $$a$$ to $$b$$ is $$(a,b)$$.
• The edges have a direction: $$(a,b)\ne(b,a)$$.
• That's why it's a directed graph.
• We usually draw digraphs like this:
• Each dot is a vertex.
• Each arrow is an edge.
• A digraph can also represent a relation.
• Vertices represent the elements of the set(s).
• Edges represent the elements of the relation.
• The above digraph is the same relation as the matrix example: $\mathbf{M}_R=\left[\begin{smallmatrix}1&0&1&0 \\ 0&0&1&0 \\ 1&1&1&0 \\ 0&0&0&1 \end{smallmatrix}\right]$
• From the digraph, we can see a few things easily:
• If a relation is reflexive, there will be a loop on each vertex.
• If it's irreflexive, there will be none.
• If it is symmetric, each arrow will be accompanied by another in the opposite direction (like between $$b$$ and $$c$$ above).
• It's antisymmetric if that never happens.
• If it is transitive, every pair of adjoining edges will also have another that does the same thing in one step, like this:
• The inverse of a relation can be found by just reversing each arrow.
• For a relation that's not on a single set (is a subset of $$A\times B$$ with $$A\ne B$$), we can still draw the graph:
• That's probably not as interesting as for relations in $$A\times A$$.

# n-ary Relations

• We introduced out subsets of $$A\times B$$ as “binary relations“. That should be a hint that there's some other kind of relations.
• For sets $$A_1,A_2,\ldots,A_n$$, an n-ary relation on those sets is a subset of $$A_1\times A_2\times \cdots\times A_n$$.
• For example, we could have a relation on $$\mathbf{Z}\times\mathbf{Z}\times\mathbf{Z}^{+}$$ consisting of triples $$(a,b,m)$$ where $$a\equiv b\ (\text{mod } m)$$.
• Then $$(3,18,5)$$ and $$(4,-5,9)$$ are in this relation, but $$(1,2,5)$$ and $$(9,100,10)$$ are not.
• We could also think of rows in a database as forming a relation.
• For example, a database table might have the following data in each row: student number, name, email address.
• Then the table basically defines a relation on $$\text{integer}\times \text{strings}\times \text{strings}$$.
• For an $$n$$-ary relation $$R$$, let $$C$$ be some condition on the elements of $$R$$. A selection operation maps $$R$$ to the subset of $$R$$ that satisfy the condition.
• For example, we could select form the above relation all people who have an “@zju.edu.cn” email address.
• Or select all entries with student number 3110012345.
• A projection $$P_{i_1,i_2,\ldots,i_m}$$ maps the element $$(a_1,a_2,\ldots,a_n)\in R$$ to $$(a_{i_1}, a_{i_2},\ldots,a_{1_m})$$.
• That is, it selects only some of the original sets from the relation.
• If we only care about student number and email, records like (3110012345, "Student Name", "student@example.com") would be projected to (3110012345, "student@example.com").
• That is the projection $$P_{1,3}$$ on the relation.
• In database terminology, we select only certain columns.
• A join on two relations is an operation that combines values based on similar data in the two relations.
• For example, suppose we have another relation on student number, course, and grade.
• We could join the student number, name, email address relation to this on the student number.
• The result would be a list of all possible combinations where the student numbers match in both relations.
• We would get a relation with values for student number, name, email address, course, and grade.
• Querying and manipulating databases of such relations is a problem for another course.
• Several courses, really.
• We would like to be able to ask questions like “which elements of the relation have some particular property?” and get the answers quickly ($$O(\log n)$$ time or better).
• SQL (Structured Query Language) gives us syntax to perform the operations described above.
• … as well as to manipulate the data (insert, update, delete) in the relation.