I think that one of the most beautiful branches of mathematics is discrete mathematic, and here I would emphasize the combinatorics and probability theory.

## Background

While I was in school I learned all about factorials, permutations, combinations and how to compute the probability that an event will happen. Honestly I never understood it like I do today (20 years later). It has probably something to do with the fact that I wasn't enough interested in this topic at that time and/or with the (lack of) vision that our math teacher had shown regarding this subject. Recently, perhaps because I felt an empty space somewhere in my belt of math tools, I've began to learn again these things, having a different approach from a different perspective. Now hopefully everything is much clear than before.

## Introduction

If you are keen into studying these topics I would suggest you few things you should take into consideration.

To be able to learn properly these stuff you should firstly start learning the basic of the set theory like what are the sets and the operations between sets (like the intersection/conjunction, union, complement set and so forth).

Once you you have the grasp of the notions mentioned earlier you should review/learn what factorial means and when to use it. After that you should continue reading about permutation and combination and what's most important the difference between them and when/how to use them. Now, if all of these are done then you should already know that combinatorics is all about counting. And because some counting are more cumbersome and/or tedious than others you should know that in counting there are few principles that they are going to be your life preserver in these area:

- the addition principle: when you want to count something that meats a certain condition OR another condition then that is equivalent with counting each condition separately then adding the results
- the multiplication principle: when you want to count that something meats a certain condition AND another condition then that is equivalent with counting each condition separately then multiplying the results
- the complement principle: sometimes it's easier to count the elements that are not in the subset than those that are in the subset; by knowing the set elements and those not contained by the desired subset you will get the elements that are within the desired subset
- the "counting double in a controlled way" principle; when to count and when to not count twice (meaning counting multiple times elements that are equivalent or the same)
- "when some element are the same" principle : to know that counting "this" is equivalent with counting something else even if they seem completely different at the first glance (for instance counting that something appears at least once in a subset is equivalent with 1- that thing appears 0 times in that subset, which is far easier than counting how many times something appears in a subset)

If there is something that you really don't understand so far then please, please, review these topics again and again until they become plain as a day to you. Only then you should go to the next step otherwise you're walking on thin ice. Combinatorics DOES NOT mean learning by hearth some rules/formulae and trying to apply them blindly (without understanding how to count manually these things). If you know how to count manually these things then you are ready to use some "tools" and principles that will accelerate and simplify the computation. If you don't have a clue how to do it manually then there are small chances that you know what is the proper mathematical tool and how/when to use it.

In this article I will go through each of these by emphasize the theory behind and by providing some useful examples that will help you to see the counting and probabilities from different perspectives.

## How many they are?

Let's assume you have two dices. You want to compute all the possible arrangements that you can get by rolling this dices. How many they are?

Well, each dice has 6 sides marked distinctly from one dot to six dots:

How many possible arrangements can we get with only one dice? The answer is simple: 6^{1}=6.

What if we use two dices? If we want to combine these two dices we could create a matrix 6x6 and then we can see all these arrangements:

6 |
{6,1} | {6,2} | {6,3} | {6,4} | {6,5} | {6,6} |

5 |
{5,1} | {5,2} | {5,3} | {5,4} | {5,5} | {5,6} |

4 |
{4,1} | {4,2} | {4,3} | {4,4} | {4,5} | {4,6} |

3 |
{3,1} | {3,2} | {3,3} | {3,4} | {3,5} | {3,6} |

2 |
{2,1} | {2,2} | {2,3} | {2,4} | {2,5} | {2,6} |

1 |
{1,1} | {1,2} | {1,3} | {1,4} | {1,5} | {1,6} |

1 |
2 |
3 |
4 |
5 |
6 |

As we can see if we have two dices there are 6^{2}=36 possible arrangements.

What if we have three dices? We can create a matrix of 6x6x6 (which can be graphically visualize as a 3D cube) and represent all these possible arrangements. They will be 6^{3}=216 possible arrangements.

In general we can combine n-dices in 6^{n} ways.

If we have a coin instead a dice, it will have only two sides: head and tail. So with one coin there are only 2^{1} possible arrangements. With two coins there will be 2^{2}. In general we can combine n-coins in 2^{n} ways.

For the general case: if we have a set of m-elements and we want to combine this set by n possible different ways with themselves, we will get m^{n} pairs of possible arrangements.

## Factorial

By definition n-factorial is noted with .

#### Example 1

We have five different letters A,B,C,D,E. In how many ways can we arrange these letters if we are allowed to use them only once? Let's see:

- I will note the number of unknown arrangements with x
- when we start the computation we have 5 available letters, so we have 5 different choices; so at this step x=5 and the remaining letters are 4
- next we have only 4 different letters available so for the 2nd character we have only 4 different choices; thus at this step x=5*4 and the remaining letters are 3
- next we have 3 different letters available; so for the 3rd character we have 3 different choices; thus at this step x=5*4*3 and the remaining letters are 2
- next we have 2 different letters available; so for the 4th character we have 2 different choices; thus at this step x=5*4*3*2 and the remaining letters are 1
- finally we have only 1 remaining letter; so for the 5th character we have only 1 choice; thus at this step x=5*4*3*2*1 and this is our final step because we have no more letters to use

The answer for this problem is x=5*4*3*2*1=5!

**Note**: by convention 0!=1 (behind it lies the same concept as in x^{0}=1).

#### Example 2

We have the following letters: MISSISSIPPI. In how many ways can we arrange these letters such that we can create distinct unique words?

If we would use all these 11 letters than:

- for the 1st position we would have 11 available letters
- for the 2nd position we would have 10 available letters (one has already been used at the previous step)
- for the 3nd position we would have 9 available letters (two have already been used at the previous steps)
- ...
- for the 10th position we would have 2 available letters (nine have already been used at the previous steps)
- for the 11th position we would have only 1 available letter (ten have already been used at the previous steps)

So we could create 11! words BUT because some of these 11 letters (eg. I,S,P) appears more than once it will be obvious that some words will be the same (i.e. not unique). Note that we are interested only in those distinct unique words.

As we can see the letter I appears 4 times. The number of different arrangements that we can get by combining 4 letters are 4!. So if we divide our 11! by this 4! then we will get only these words where I appears only once.

We do the same for S and also for P.

So if we divide our 11! by 4! * 4! * 2! then we will get only these distinct unique words i.e. 34650 unique words instead 11!=39916800 words (in which case some words would be duplicated).

#### Conclusion

When you have n-elements and want to know in how many different ways you can arrange them (without using the same letter more than once) then the answer is n!

**Note**: What if a letter appears more than once in the initial set of elements? Let's take the set of {A,B,C,A,D} for instance. How to determine in how many ways we can arrange these letters so that no one appears more than once? Well, for each letter that appears more than once you should divide your answer to the number of repetitions (r) factorial, i.e. r! So in the case of the set {A,B,C,A,D} your answer should be divided by 2! because we have two letters of A. If they would be three then we would divide by 3!

## Permutations

Permutations answer to the question "in how many ways can n-elements be arranged in a set of k-elements when the position/order **DOES** matter". By the position/order **does** matter I mean the following pairs of elements are regarded as different although they contains exactly the same elements (but their position/order matters): {A,B,C} and {B,C,A} and {A,C,B} and {B,A,C}.

In case that the letters' position doesn't matter then the set {A,B,C} would be regarded as equivalent (the same) with the set {A,C,B} or {B,C,A} and so forth because the element's position doesn't matter, what really matters is that the items A and B and C are contained by that set. It's like when you play cards and you have a hand of two aces and three kings. It doesn't matter if your hand is ordered like {ace,king,ace,king,king} or {king,ace,ace,king,king} because these cards means the same to you, what matters to you is that you have them and not in which order they are positioned/ordered in your hand.

So remember that in permutation the order DOES matter. If you are still confused please read again the both paragraphs above.

To convince you that there are situations when the elements' order matter I will give you this example:

We have five people and a car with four places. We take randomly a person at time and put it on a seat. The people will get their seats as following: the first person will get the driver seat, the next one will take the copilot seat, the third one will take the seat behind the driver seat and the forth will take the seat behind the copilot seat. The fifth person would eventually be placed in the car trunk. As you can see the order/position where we place these people matters: there is a big difference between sitting on the driver seat and sitting in the car trunk, right?

#### Example

We have a set of five different letters {A,B,C,D,E}. We want to create words of three letters. In how many ways can we arrange these letters if we use them only once? Let's see:

- let's note the number of elements in our set with n; in our example n=5;
- let's note the length of the word that we want to create with k; in our example k=3
- let's note the number of all possible arrangements with x; we are going to compute in how many ways we can arrange these letters, i.e. we're going to compute the value of x
- at the beginning we have 5 different letters to chose from; thus at this step x=5 and the remaining letters are 4
- next we have only 4 different letters to chose from; so for the second character there are 4 different choices; thus at this step x=5*4 and the remaining letters are 3
- next we have only 3 different letters to chose from; so for the third character there are 3 different choices; thus at this step x=5*4*3 and the remaining letters are 3
- for the last position we have 3 different letters available; so x=5*4*3 and this is our final step because we reached the maximum allowed limit of the word which was k=3

The answer for this problem is

Permutations of n taken by r is sometimes written as or function *nPr* on your calculator.

**Note**: if some letters appears more than once in the initial set (e.g. the set is {A,B,C,A,D}) you'll get some duplicated words and thus, for each letter that appears more than once in the original set you should divide the final result by the number of repetitions (r) of that letter factorial, i.e. r!

#### Conclusion

When you have n-elements and want to arrange them in set of k-elements (with k<=n) you should know that there are arrangements.

## Combinations

Combinations answer to the question "in how many ways can n-elements be arranged in a set of k-elements when the position/order DOESN'T matter". See the permutation if you are confused about he meaning of "the position/order does/doesn't matter".

#### Example 1

You have n=5 different people called A,B,C,D,E and you want to pair them such that they can play tennis in doubles, i.e. by k=2. In how many ways you can arrange these people such that you get different teams? Note that having a team formed by {A,B} or a team formed by {B,A} would be the same thing so we will count these only once.

Below I created a matrix that shows all the possible arrangements:

{A,B} | {A,C} | {A,D} | {A,E} |

{B,A} | {B,C} | {B,D} | {B,E} |

{C,A} | {C,B} | {C,D} | {C,E} |

{D,A} | {D,B} | {D,C} | {D,E} |

{E,A} | {E,B} | {E,C} | {E,D} |

Those 20 pairs were calculated as . Note that some of them appear twice, like {A,B} and {B,A} which in fact means the same people (I don't care in which order you show their names, they are the same!). We have 10 such equivalent pairs:

- {A,B} <-> {B,A}
- {A,C} <-> {C,A}
- {A,D} <-> {D,A}
- {A,E} <-> {E,A}
- {B,C} <-> {C,B}
- {B,D} <-> {D,B}
- {B,E} <-> {E,B}
- {C,D} <-> {D,C}
- {C,E} <-> {E,C}
- {D,E} <-> {E,D}

Because we are interested in only those pairs that are unique we should divide the total (non-unique) arrangements (which is 20) by k! i.e. by 2!

Finally we've got our unique pairs which are {A,B}, {A,C}, {A,D}, {A,E}, {B,C}, {B,D}, {B,E}, {C,D}, {C,E} and {D,E}.

But what if we have n=1 million people and we want to pair them in teams of k=5 or k=555? That would be extremely hard to count, isn't it? If you do this like I did in the example above then YES, it would be both cumbersome and tedious and moreover it would be definitely error-prone. So there must be a "tool" for this, right?

Well, let's see what a combination means: a combination is nothing more than a permutation with the exception that we want to exclude the duplicates, meaning that we want to avoid the double counting. How we avoid the double counting? Well, we actually count it normally (like in permutations) but we divide by the k! because there will be exactly k! equivalent pairs.

#### Conclusion

In general combination of n-elements taken by r is calculated as or function *nCr* on your calculator.

## Probability

Now it's time to use these combinatoric notions in a more useful way. The probability is all about computing the chance that an event is likely to happen.

Let's start defining few notions we are going to use when we will talk about probability:

- experiment: it can be anything from flipping a coin, rolling dices, randomly selecting a marble from a jar, etc
- result (outcome): the result of an experiment is called outcome (getting a three by rolling a dice, the "dice is three" is the result of the experiment)
- the sample space: is formed by considering all different collections of possible results (eg. rolling two dices can produce theoretically 36 different results: {1,1},{1,2},...{6,5},{6,6})
- the space of experiment: the collection of results produced by the experiment (eg. rolling a dice twice produces a six and a three; the space of experiment is given by {6,3}); the space of experiment is a subset of the sample space
- event: a combination of outcomes (eg. if the experiment is to flip one fair coin, the event A might be A="getting at most one head")
- the complement of an event: consists of all outcomes that are NOT in the event outcome (eq. if the event is "getting a 6 when rolling a dice" then its complement is "not getting a 6 when rolling a dice");
- probability: if the event is A then the probability that the event A occurs is noted with P(A) and is ; note that the probability of an event is a number between 0(0% chances to happen) and 1 (100% chances to happen)
- if the event is A and it's probability is P(A) then P(A)+P(not A)=1; thus, by using the probability of the complement we can compute the the probability of the event A : P(A)=1-P(not A)
- a fair coin=a coin where the chance to get a head or a tail is evenly distributed, namely it is equal to 1/2. A not fair coin (biased) is a coin that due to a defect is more likely to land a side than the other side (e.g. 75% to get a head and only 25% to get a tail)
- a fair dice=a dice where the chance to get either an one or two or ... a six is evenly distributed, namely is equal to 1/6. A not fair dice (biased) is a loaded dice or a dice that due to a defect is more likely to land on a side than the others (e.g. 25% chance to land on 6, 23% chance to land on 5, and let's say 13% to land on any other side).

#### Example 1

You have a fair coin and you flip it. What is the chance that you get a head? What about the chance of getting a tail? Let's see: you can get either a head or tail so it's one or the other. With other words is one of two or 1/2 (1/2=0.5=50%). What are the chances that you get head two times in a row? Let's put it in this way: what are the chance that on the first flip you get a head AND on the second flip you get also a head? We compute the chance to get a head on the first flip (which is 1/2) and then we compute the chance to get a head on the second flip (which is also 1/2). Because we deal with the conjunction AND (head on first AND head on second) we should know (see the set theory) that we have to multiply these chances with each other, to be more exact the cumulated chances to get a head on the first flip (1/2) AND the chance to get a head on the second flip (1/2) are 1/2 * 1/2 = (1/2)^{2}. If we flip n-times the coin and we are asking what is the chance of getting a head n-times in a row then the probability is (1/2)^{n}. This is an example of where we use the multiplication principle named earlier.

#### Example 2

You have a fair dice and you roll it. What is the chance that you get a six? What about the chance of getting a three? Let's see: you can get either an one or a two, or ... or a six; so there is one chance of six possible choices. What are the chances that you get also a six when you roll the second time the same dice? When you roll again the dice the chance that you'll get a 6 will be also one of six possible choices (so it's also 1/6). The chances that the first event occur AND the second event occur is, by the multiplication principle named earlier, 1/6 multiplied by 1/6, i.e. (1/6)^{2}. What are the chances you get three six on a row, i.e. a six on the first try AND a six on the second try AND a six on the third try. Well, that would be (1/6)^{3}. This is also an example of where we use the multiplication principle.

#### Example 3

You have a fair dice and you roll it three times. What is the chance that you will get two of six AND a three?

To roll a dice three times is the same as rolling three different dices at once (also the reciprocal is true). In how many ways can we arrange these three dices (if we care about the order) ? With only two dices we would have 6*6=6^{2} possibilities. With three dices we would have 6*6*6=6^{3} possibilities (in general with n-dices we would have 6^{n} possibilities).

So the sample space has 36 possible outcomes. Of these we'll get two of six and a three in only three experiments, so our space of experiment has three outcomes:

1st way the event can happen |
6 | 6 | 3 |

2nd way the event can happen |
6 | 3 | 6 |

3rd way the event can happe |
3 | 6 | 6 |

The probability to get one of these three outcomes would be 1/6^{3}. The probability to get all these three combinations is thus 3*(1/6^{3}). This is our final answer.

#### Example 4

You have a fair dice and you roll it three times. What is the chance that you will get two of six AND the third dice something different than six?

We will try to count this using some other method. In this scenario we have only two options: to get a six (to succeed) or to NOT get a six (to fail).

We will note with n=number of times the dice is rolled, i.e. n=3. We note with r=the number of times we want to succeed, i.e. r=2. We note with p=the probability that we succeed when we throw only a dice (i.e. to get a six). With a fair dice the chance would be 1/6, i.e. p(A)=1/6. The complement of to succeed is to fail and if P(A)=1-P(not A) then P(not A)=1-P(A).=1-1/6=5/6.

So the probability to get two of six AND something different than a six would be 1/6*1/6 multiplied by 5/6, i.e. p^{r}*(1-p)^{n-r}. Because the number of ways that the elements of a set can be arranged are proportional with the number of elements in the set (n=3) and also with the length of the subsets (r), we would expect to get many equivalent subsets so so we should multiply our result by the number of arrangements n=3 dices can be combined in pairs of r=two (like in a "two of six"), i.e. multiply by 3C2.

So finally the probability to get two of six and one of something else than six is 3C2*p^{r}*(1-p)^{n-r}=3!/(2!*(3-2)!)*(1/6)^{2}*(1-1/6)^{3-2}=3*5/6^{3}=0.0694 i.e. 6.94%.

#### Conclusion

The probability that an event A succeed r-times from n-experiments is

where p is the probability that the event A occur and (1-p) is the probability that the event A will not occur.

## Probability of 'at least'

Sometimes we are interested in counting the probability that an event will occur at least N-times. At least is the same as minimum. The probability P that some event A succeed minimum N times is equivalent with:

P(A succeed min N times)=1-P(A succeed max N-1 times). **This is very useful!**

## Probability of 'at most'

Sometimes we are interested in counting the probability that an event will occur at most N-times. At most is the same as maximum. The probability P that some event A succeed maximum N times is equivalent with:

P(A succeed max N times)=1-P(A succeed min N+1 times). **This is very useful!**

## Combining different events

Most of time you will have to count the number of occurrence of a composite event and not just a single isolated event. This will make the thing little bit complicated so it would be far easier if you can split the composite event in many isolated single events that are easy to count individually. For instance, you want to count in how many ways you can roll two dices such that the sum of the dices is a odd number. You can create a 6x6 matrix like the one below where we mark with red those events where the sum of the outcomes is a odd number:

6 |
7 |
8 | 9 |
10 | 11 |
12 |

5 |
6 | 7 |
8 | 9 |
10 | 11 |

4 |
5 |
6 | 7 |
8 | 9 |
10 |

3 |
4 | 5 |
6 | 7 |
8 | 9 |

2 |
3 |
4 | 5 |
6 | 7 |
8 |

1 |
2 | 3 |
4 | 5 |
6 | 7 |

1 |
2 |
3 |
4 |
5 |
6 |

We can count these events and we see that there are 18 in total. With two dices we get a 2D matrix of events. With three dices we get a 3D matrix of events. With 4 dices we get a 4D matrix of events. Can we draw such a matrix? What about a N-dimension matrix? So there are situations where it is impractical (if not impossible) to count manually. For such problems it would be far easier to use a divide and conquer strategy by breaking down the problem into two or more sub-problems of the same (or related) type, until these become simple enough to be solved directly.

## Mathematical operations between events

#### Independent events

If two events A and B are independent then probability that both events happen is

for example, if you roll two dices the chance of getting six with both of them is

#### Mutually exclusive events

If either event A OR event B OR both events occur on a single performance of an experiment this is called the union of the events A and B denoted as . The probability that an event A OR an event B occurs is

For example, the chance of getting a 6 OR 1 on a six-sided die is

#### Not mutually exclusive events

If the events are not mutually exclusive then

For example, when drawing a single card at random from a regular deck of cards, the chance of getting a heart or a face card (J,Q,K) (or one that is both) is , because of the 52 cards of a deck 13 are hearts, 12 are face cards, and 3 are both: here the possibilities included in the "3 that are both" are included in each of the "13 hearts" and the "12 face cards" but should only be counted once.

#### Conditional probability

Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written , and is read "the probability of A, given B". It is defined by

If P(B)=0 then is formally undefined by this expression. However, it is possible to define a conditional probability for some zero-probability events using a Ïƒ-algebra of such events (such as those arising from a continuous random variable)

Now, if you think that this article was interesting don't forget to rate it. It shows me that you care and thus I will continue write about these things.

#### Eugen Mihailescu

#### Latest posts by Eugen Mihailescu (see all)

- Dual monitor setup in Xfce - January 9, 2019
- Gentoo AMD Ryzen stabilizator - April 29, 2018
- Symfony Compile Error Failed opening required Proxies - January 22, 2018