# Introduction

# Axioms of Probability

# Consequences of Kolmogorov's Axioms of Probability

# Conclusion

This series of tutorials is an introduction to the basic statistics used in machine learning algorithms.

Definitions: let (\(\Omega\), F, P) be a measure space with P(\(\Omega\)) = 1. This makes (\(\Omega\), F, P) a probability space, where \(\Omega\) is the sample space, F is the event space (a collection of subsets of \(\Omega\)), and P is the probability measure.

In Kolmogorov's probability theory, the probability P of an event E must satisfy three axioms.

1. **P(E) \(\in\) \(\mathbb{R}\), P(E) \(\geq\) 0, \(\forall\) E \(\in\) F**. The probability of an event is a non-negative real number, for every event E in the event space.
2. **P(\(\Omega\)) = 1**. The probability that some outcome in the sample space occurs is 1 (some outcome must occur).
3. **P(\(\cup_{i=0}^{\infty} E_i\)) = \(\sum_{i=0}^{\infty} P(E_i)\)** for any countable sequence of mutually exclusive (pairwise disjoint) events \(E_0, E_1, \ldots\). The probability of a union of mutually exclusive events is the sum of their individual probabilities.
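The three axioms can be checked by brute force on a small finite example. The sketch below assumes a fair six-sided die with a uniform measure; the names `omega`, `prob` and `events` are illustrative choices, and axiom 3 is only checked in its finite form (pairs of disjoint events), since a finite sample space has no infinite sequence of distinct events.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: a fair six-sided die.
omega = frozenset(range(1, 7))

def prob(event):
    """Uniform probability measure: |E| / |Omega|."""
    return Fraction(len(event), len(omega))

# Event space F: here, every subset of omega (the power set).
events = [frozenset(c)
          for r in range(len(omega) + 1)
          for c in combinations(omega, r)]

# Axiom 1: every probability is non-negative.
axiom1 = all(prob(e) >= 0 for e in events)

# Axiom 2: the whole sample space has probability 1.
axiom2 = prob(omega) == 1

# Axiom 3 (finite form): probabilities add for disjoint events.
axiom3 = all(prob(a | b) == prob(a) + prob(b)
             for a in events for b in events
             if not (a & b))

print(axiom1, axiom2, axiom3)  # True True True
```

Using `Fraction` instead of floats keeps the equality checks exact.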

**Probability of Empty Set:**

P(\(\emptyset\)) = 0

**Monotonicity:**

If A \(\subseteq\) B then P(A) \(\leq\) P(B)

**Numeric Bound:**

0 \(\leq\) P(E) \(\leq\) 1
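These three consequences can be checked numerically before attempting a proof. The sketch below assumes a hypothetical loaded die in which the face 6 has probability 1/2 and every other face has probability 1/10; the `weights` table and the helper `prob` are illustrative, not part of the theory above.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: a loaded die where 6 comes up half the time.
weights = {face: Fraction(1, 10) for face in range(1, 6)}
weights[6] = Fraction(1, 2)
omega = frozenset(weights)

def prob(event):
    """Probability of an event = sum of the weights of its outcomes."""
    return sum((weights[x] for x in event), Fraction(0))

# All 64 subsets of omega serve as the event space.
events = [frozenset(c)
          for r in range(len(omega) + 1)
          for c in combinations(omega, r)]

# Probability of the empty set.
empty_ok = prob(frozenset()) == 0

# Monotonicity: A subset of B implies P(A) <= P(B).
monotone_ok = all(prob(a) <= prob(b)
                  for a in events for b in events
                  if a <= b)

# Numeric bound: 0 <= P(E) <= 1 for every event.
bounded_ok = all(0 <= prob(e) <= 1 for e in events)

print(empty_ok, monotone_ok, bounded_ok)  # True True True
```

A check like this on one example is not a proof, but it is a quick way to catch a misstated property.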

**Probability of Union of Events (Principle of Inclusion-Exclusion):**

P(A \(\cup\) B) = P(A) + P(B) - P(A \(\cap\) B)

As a consequence of axiom 3, it can also be shown that:

P(A \(\cup\) B) = P(A) + P(B \(\setminus\) (A \(\cap\) B)). In other words, the event "A or B" can be described as either A occurring, or B occurring without A.
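Both expressions for the probability of a union can be compared on a concrete case. The sketch below assumes a fair six-sided die, with A = "the roll is even" and B = "the roll is at least 4"; these events and the helper `prob` are chosen purely for illustration.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die.
omega = frozenset(range(1, 7))

def prob(event):
    """Uniform probability measure: |E| / |Omega|."""
    return Fraction(len(event), len(omega))

a = frozenset({2, 4, 6})   # the roll is even
b = frozenset({4, 5, 6})   # the roll is at least 4

lhs = prob(a | b)
# Subtract the overlap that P(A) + P(B) counts twice.
via_overlap = prob(a) + prob(b) - prob(a & b)
# Add only the part of B that lies outside A.
via_disjoint = prob(a) + prob(b - (a & b))

print(lhs, via_overlap, via_disjoint)  # 2/3 2/3 2/3
```

The second form works because A and B \ (A ∩ B) are disjoint, so axiom 3 applies to them directly.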

**Complement Rule:**

P(\(A^c\)) = P(\(\Omega \setminus A\)) = 1 - P(A)
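As a quick illustration of this identity, the sketch below again assumes a fair six-sided die and takes A = "the roll is 1 or 2"; the names are illustrative only.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die.
omega = frozenset(range(1, 7))

def prob(event):
    """Uniform probability measure: |E| / |Omega|."""
    return Fraction(len(event), len(omega))

a = frozenset({1, 2})
a_complement = omega - a   # every outcome not in A

print(prob(a_complement), 1 - prob(a))  # 2/3 2/3
```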

This is only a very basic introduction to statistics, and it is the foundation on which further theorems will build. I highly recommend that the reader prove the consequences outlined above. If you have any questions, please leave a comment below.