Statistics notation

Introduction

Probability and statistics is replete with all sorts of strange notation. In this module, we try to clarify some notation that we use in other modules. In doing so, we provide a very brief outline of the foundations of probability and statistics.

The experimental setup

Every statistics problem begins with an experiment denoted $\mathcal{E}$. It can be someone flipping a coin, determining the time it takes for a cell to divide, or determining whether a certain drug is effective – it doesn’t matter.

Of course, every experiment $\mathcal{E}$ has an outcome. For example, when flipping a coin, there are two possible outcomes, heads $H$ and tails $T$. The collection of all possible outcomes of an experiment we denote $\mathcal{S}$ and call the sample space. Mathematically, $\mathcal{S}$ is a set. For example, in the case of flipping a coin, $\mathcal{S} = \{H, T\}$.
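As a small illustration, a sample space can be written down directly as a set in code. The sketch below uses Python; the variable names and the two-flip experiment are assumptions added for illustration and are not part of the module itself.

```python
from itertools import product

# Sample space for a single coin flip, as in the text: S = {H, T}.
sample_space = {"H", "T"}

# Hypothetical extension (not from the text): flipping the coin twice.
# Each outcome is an ordered pair such as ("H", "T").
two_flip_space = set(product(sample_space, repeat=2))

print(len(sample_space))      # number of outcomes for one flip
print(sorted(two_flip_space)) # all outcomes for two flips
```

Representing outcomes as elements of a set mirrors the mathematical definition: order does not matter and no outcome is listed twice.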

The definition of probability

Subsets of the sample space, i.e. collections of outcomes of the experiment $\mathcal{E}$, are called events. In most cases, it is not useful to simply assign every element $s$ of the sample space a probability. Instead, we usually assign probabilities to events.

At this point a little set theory helps and sets the stage for all of probability theory. In this article we just give the basic idea; for a more advanced exposition, look for books on measure-theoretic probability such as Resnick’s A Probability Path or Billingsley’s Probability and Measure. Both are advanced texts that require at least an undergraduate-level background in mathematics. The Wikipedia probability outline is also a handy resource.

Onward! For any set $\mathcal{A}$, the power set of $\mathcal{A}$ is the set of all subsets of $\mathcal{A}$; it’s denoted $\mathcal{P}(\mathcal{A})$. For example, the subsets of the coin-flip sample space $\mathcal{S} = \{H, T\}$ are $\{H, T\}$, $\{H\}$, $\{T\}$, and $\emptyset = \{\}$, the so-called empty set, which is by definition a subset of any set (as is the set itself). So the power set is $\mathcal{P}(\mathcal{S}) = \big\{\{H,T\}, \{H\}, \{T\}, \{\}\big\}$. In general, if a set has $n$ elements, then its power set has $2^n$ elements. In the coin flipping case, $\mathcal{S}$ has 2 elements, and the power set has $2^2 = 4$ elements.
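The power set of a small finite set can be enumerated by listing its subsets of each size in turn. The following Python sketch does this for the coin-flip sample space; the function name `power_set` and the use of `frozenset` to represent events are choices made here for illustration.

```python
from itertools import chain, combinations

def power_set(elements):
    """Return the power set of `elements` as a set of frozensets."""
    items = list(elements)
    # Collect the subsets of size 0, 1, ..., n.
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

sample_space = {"H", "T"}
events = power_set(sample_space)

print(len(events))  # 4, i.e. 2**2
```

Using `frozenset` makes each subset hashable, so the subsets can themselves be collected in a set. The $2^n$ count can be checked for other sizes as well, for example `len(power_set(range(5)))` gives 32.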

We are now at a place where we can define a probability. A probability is a function, usually denoted $P$, which assigns a number to every element of the power set of the sample space, i.e. to every event. Of course, not just any function will do. The function $P$ must satisfy the following three properties to be a probability:

1. The probability of the sample space is 1: $P(\mathcal{S}) = 1$.
2. Probabilities can’t be negative: for any event $\mathcal{A} \in \mathcal{P}(\mathcal{S})$, $P(\mathcal{A}) \geq 0$.
3. If $\mathcal{A}$ and $\mathcal{B}$ are disjoint sets (they don’t contain any of the same elements), then $P(\mathcal{A} \cup \mathcal{B}) = P(\mathcal{A}) + P(\mathcal{B})$.
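The three properties can be checked mechanically for a concrete example. The Python sketch below assumes a fair coin, so that each outcome carries weight 1/2; the text does not specify this, and the helper names `power_set` and `P` are illustrative only.

```python
from fractions import Fraction
from itertools import chain, combinations

sample_space = frozenset({"H", "T"})

def power_set(elements):
    """Return the power set of `elements` as a set of frozensets."""
    items = list(elements)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

def P(event):
    """Assumed fair coin: every outcome is equally likely."""
    return Fraction(len(event), len(sample_space))

events = power_set(sample_space)

# Property 1: the sample space has probability 1.
normalized = P(sample_space) == 1

# Property 2: no event has negative probability.
non_negative = all(P(a) >= 0 for a in events)

# Property 3: probabilities of disjoint events add.
additive = all(
    P(a | b) == P(a) + P(b)
    for a in events
    for b in events
    if a.isdisjoint(b)
)

print(normalized, non_negative, additive)  # True True True
```

Exact fractions are used rather than floating-point numbers so that the equality checks in properties 1 and 3 are not affected by rounding.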
