Intro

We have made some claims about the number of some things.
- \(|P(S)|=2^{|S|}\)
- \(|A\times B| =|A|\cdot|B|\)
- Six bits can represent 64 distinct values.
- These have been unproved.
We will cover several topics where the basic underlying questions are “how many…?” or “in how many ways…?” or sometimes, “what are all of the ways…?”
There are several techniques to answer these questions, depending on the details of what is being counted.
Some books call this combinatorics because some people like making things sound difficult.
- We'll stick with counting.

Basics: Adding and Multiplying

A lot of counting problems can be solved with two simple (and maybe obvious) techniques.
The product rule: If there are \(m\) ways to select the first item, and \(n\) ways to select the select the second item, then there are \(mn\) ways to select the two items.
- … and so on for three or more things to be selected.
Example: You go into a fast food restaurant. You have the choice of 6 meal combos. Each one comes with a drink, and there are 5 choices for the drink. There are \(6\cdot 5=30\) possible meals you can order.
Example: How many bits strings are there of length 8. (i.e. How many bytes are there?)
- Bit string: a string of 0s and 1s. For example, “00101” and “11110” are bit strings of length 5.
- For length 8, there are two ways to select the first bit, two ways to select the second, …
- So in total, \(2^8=256\) possibilities for length 8.
Example: In poker, each player receives a five card hand from a deck of 52 cards. How many possible hands are there?
- First card: 52 possibilities.
- Second card: 51 possibilities (since you can't get the first card again).
- Third: 50. Fourth: 49. Fifth: 48.
- All together, there are \(52\cdot 51\cdot 50\cdot 49\cdot 48 = 311875200\) possible hands.
- Not quite right: that counts the hand 2♥,3♥,4♥,5♥,6♥ as different than the hand 6♥,5♥,4♥,3♥,2♥. Most people would think those are the same.
- We'd better come up with a way to fix that.
- The answer is right if the order does matter. (Which is somewhat true in hold 'em, for example.)
Theorem: For finite sets \(S\) and \(T\), the Cartesian product has cardinality \(|S\times T|=|S|\cdot|T|\).

Proof: Consider elements \((s,t)\in S\times T\). There are \(|S|\) ways to choose the first element in the pair, and \(|T|\) ways to choose the second. By the product rule, there are \(|S|\cdot|T|\) ways to choose the pair, so \(|S\times T|=|S|\cdot|T|\).∎
Theorem: For finite sets \(S\), the power set of \(S\) has \(|P(S)| =2^{|S|}\) elements.

Proof: Since \(P(S)\) is the set of all subsets of \(S\), we must count the number of subset of \(S\).

We can form a subset of \(S\) by either including or not including the first element (two possibilities). Similarly, we can either include or exclude the second element, and the third, and so on for each element of \(S\).

Thus we have \(2^{|S|}\) possible subsets of \(S\) so \(|P(S)| =2^{|S|}\).∎
The sum rule: If you if you can select an item in one of \(m\) ways, or in one of \(n\) ways, then there are \(m+n\) ways to make the selection.
Example: There are 25 computing courses being offered this semester, and 15 math courses. If you want to sign up for either a computing or a math course, then you have \(25+15=40\) possible choices.
Example: Suppose \(S\) and \(T\) are sets with no elements in common (i.e. \(S\cap T=\emptyset\)). Then \(|S\cup T|=|S|+|T|\).
As you can see: the sum rule isn't very exciting on its own. It is useful when doing more complex counting problems.
Example: If a poker player plays two hands, how many combinations of cards are possible if the order the cards are received matters? (The deck is re-shuffled between hands.)
- First hand: 311875200 possibilities.
- Second hand: 311875200 possibilities.
- All together, there are \(311875200+311875200=623750400\) possible hands.
How many bit strings of length 5 contain “111”?
- Possible ways for that to happen:
  - the “111” starts in the first position: “111??”. 4 ways.
  - the “111” starts in the second position: “0111?”. 2 ways.
  - the “111” starts in the third position: “?0111”. 2 ways.
- So there are \(4+2+2=8\) such strings.

Aside: Counting and Running Time

Remember that calculating the running time of an algorithm comes down to counting the number of steps the algorithm takes for particular input.
- So, these counting techniques are often useful.
Example: How many lines does this algorithm print?
```
procedure do_something(n)
    for i from 1 to n:
        print i
    for i from 1 to n:
        for j from 1 to n:
            print i,j
```
Answer: The lines it prints are all all possible choices from \(\{1,2,\ldots,n\}\) and then all choices from \(\{1,2,\ldots,n\}\times\{1,2,\ldots,n\}\). So, it prints \(n+n^2\) lines.

Notice: for running time, we didn't have to worry about the order the lines are printed. Just enumerating the possibilities as enough to get the answer in this case. That is often good enough: how many elements are in the list of steps the algorithm takes?

If we had been asked “what is the big-Θ running time”, our answer would be \(\Theta(n^2)\).

Principle of Inclusion-Exclusion

We used the sum rule to count the number of elements of \(|S\cup T|\), where \(S\) and \(T\) were disjoint sets.
- They had to be disjoint, otherwise we would have been wrong: for \(S=\{1,2,3,4\}\), \(T=\{3,4,5\}\), we have \(|S|=4\), \(|T|=3\), \(|S\cup T|\ne 7\).
- In this example, we counted the 3 and 4 (the elements of \(S\cap T\)) twice.
In cases like this, where we count something twice, the solution is obvious: only count it once.
- Add the two values together (include everything), counting some things twice, then subtract the duplicates (exclude them).
- The principle of inclusion-exclusion, or maybe the subtraction principle.
- Or in other words, \[|S\cup T| = |S|+|T|-|S\cap T|\,.\]
- In our example, this gives \(|S\cup T|=4+3-2=5\), which is correct.
Example: How many integers from 1 to 100 are divisible by 3 or 5?
Answer: There are \(\lfloor 100/3\rfloor=33\) that are divisible by 3, and \(\lfloor 100/5\rfloor=20\) that are divisble by 5. There are \(\lfloor 100/15\rfloor=6\) that are divisible by both. So, there are \(33+20-6=47\) that are divisible by either.
Example: How many integers from 1 to 100 are divisible by 4 or 6?
Answer: There are \(\lfloor 100/4\rfloor=25\) that are divisible by 4, and \(\lfloor 100/6\rfloor=16\) that are divisble by 6. There are \(\lfloor 100/12\rfloor=8\) that are divisible by both. So, there are \(25+16-8=33\) that are divisible by either.
Example: How many bit strings of length 5 do not contain “111”?
Answer: Let \(S\) be the set of bit strings of length 5; \(|S|=2^5=32\). Let \(T\) be the set of length 5 strings that contain “111”. From the above, \(|T|=8\). We are interested in \(|S-T|\). Since we know that \(T\cap(S-T)=\emptyset\), we have \(|S-T|=32-8=24\).

The Pigeonhole Principle

This is an idea that's simple enough, it probably doesn't really need a name.
- But since it has a name, we have something to write in a proof/answer to justify it: “because of the pigeonhole principle…”.
The pigeonhole principle: Theorem: If \(k+1\) objects are put in \(k\) boxes, at least one box has two or more objects in it.
The generalized pigeonhole principle: Theorem: If \(N\) objects are distributed into \(k\) boxes, then there is a box with \(\lceil N/k\rceil\) or more objects.
Proof: Suppose for contradiction that each box has \(\lceil N/k\rceil-1\) or fewer objects. Then the total number of objects is at most \[k(\lceil N/k\rceil-1) < k((N/k+1)-1) = N\,.\] So, there are less than \(N\) objects accounted for and we have a contradiction.∎
Example: (Some applications of these are obvious.) There are 56 people in this course; homework 3 was marked out of 20. Therefore, at least \(\lceil 56/20\rceil=3\) people received the same mark on homework 3.
Example: A function from a set with \(n\) elements to a set of less than \(n\) elements is not one-to-one. By the pigeonhole principle, there must be some value in the range that has more than one value from te range mapped onto it.
Example: There are at least 13 people in Hangzhou with exactly the same number of hairs on their head. People have a maximum of around 500k hairs on their head. Hangzhou has a population of about 6.2 million. Dividing, we get \(\lceil 6.2\times 10^6/5\times 10^5\rceil = 13\).
Some uses are less obvious:
Example: For every integer \(n\), there is a multiple of \(n\) whose decimal expansion contains only 0s and 1s.

Proof: Consider \(n+1\) integers in the form \(1,11,111,1111,\ldots\), and the values of each modulo \(n\).

By the pigeonhole principle, at least two of them must have the same remainder modulo \(n\). If we subtract the smaller from the larger, we get an integer that is divisible by \(n\) and has only 0s and 1s in its decimal expansion.∎
Example: No (lossless) data compression algorithm works for every input.
Proof: Suppose not, that you have an algorithm that can compress every \(n\) bit sequence into a sequence of \(n-1\) or fewer bits, and then uncompress that to get the original data back.

There are \(2^n\) original messages, but less than \(2^{n-1}\) compressed results. By the pigeonhole principle, at least two of the uncompressed messages must generate the same compressed result. It is impossible to uncompress those to get the original (different) messages back.∎
Note: the previous theorem doesn't mean that some compression algorithms can't work very well on a lot of very useful messages, just that there are also some that they do badly with.