# Regression with a binary response

#### Predicting a yes/no type (binary) response given a continuous predictor.

The module below illustrates several common generalized linear models (GLMs) with a continuous predictor $X$ and a binary response $Y$. The most important example is logistic regression, which uses the logit link function in the GLM. The logit function is defined by $\mbox{logit}(x) = \log\left(\frac{x}{1-x}\right)$.
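To make the idea concrete, here is a small sketch (not the module's own code) that fits a logistic regression from scratch with Newton's method on simulated data; the coefficient values $(-0.5, 2.0)$ are made up for illustration:

```python
import numpy as np

def logit(p):
    # The logit link: logit(p) = log(p / (1 - p))
    return np.log(p / (1 - p))

def fit_logistic(x, y, n_iter=25):
    """Fit P(Y = 1 | x) = 1 / (1 + exp(-(b0 + b1 x))) by Newton's method."""
    X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))     # fitted probabilities
        w = p * (1 - p)                         # Bernoulli variances (IRLS weights)
        # Newton/IRLS step: beta += (X' W X)^{-1} X' (y - p)
        beta += np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - p))
    return beta

# Simulate data with hypothetical true coefficients b0 = -0.5, b1 = 2.0.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 2.0 * x)))
y = rng.binomial(1, p_true)
b0, b1 = fit_logistic(x, y)   # estimates should land near (-0.5, 2.0)
```

With 500 observations the Newton iterations converge quickly, and the estimates recover the simulated coefficients up to sampling error.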

# Facilitated elicitation of the beta distribution

#### This module was presented at the 2013 Joint Statistical Meetings.

The module above is intended to help statisticians and subject-matter experts communicate. It is designed specifically with the prior elicitation of a population proportion in mind (e.g. the binomial parameter π, or p), using the mode/percentile method. This module is still in its design stage. Like all BaylorISMs modules, it comes with absolutely no guarantee.
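To sketch what mode/percentile elicitation involves (this is an illustration, not the module's implementation): the expert supplies a most likely value (the mode) and a percentile bound, and one solves numerically for the beta parameters consistent with both. The function name and example numbers below are hypothetical:

```python
from scipy.optimize import brentq
from scipy.stats import beta

def elicit_beta(mode, q, perc=0.90):
    """Find (a, b) of a Beta(a, b) with the given mode such that
    P(theta <= q) = perc (the mode/percentile method)."""
    def b_of_a(a):
        # mode = (a - 1) / (a + b - 2)  =>  b = 1 + (a - 1)(1 - mode)/mode
        return 1.0 + (a - 1.0) * (1.0 - mode) / mode
    def gap(a):
        return beta.cdf(q, a, b_of_a(a)) - perc
    # A root exists in this bracket when mode < q and q < perc.
    a = brentq(gap, 1.0001, 1e4)
    return a, b_of_a(a)

# Hypothetical elicitation: most likely value 0.3, and the expert is
# 95% sure the proportion is at most 0.6.
a, b = elicit_beta(0.3, 0.6, perc=0.95)
```

The one-dimensional root-find works because fixing the mode leaves a single free parameter: every choice of $a$ determines $b$, and the percentile constraint pins down $a$.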

The slides from the 2013 JSM talk concerning this module are now posted! You can find them either below OR here for higher resolution.

# GRE Scores (and GMAT, MCAT, and LSAT scores!)

#### How’d I do? An overly simplified look at GRE scores.

[Note: This module may take a little while to load, so be patient!]

When preparing for the GRE it can be hard to figure out what all the numbers mean. We’re used to seeing scores like 93 (or maybe more like 63!), B–, or 3.2, and, over the course of growing up, we kind of have a feel for what those mean.

But the GRE’s not like that.

In a way, that’s because it’s very hard to figure out what’s a hard question and what’s an easy one. Think about it: the people writing the GRE have spent years understanding the topics that they are writing about, kind of like you’ve spent years spelling. Imagine how difficult it would be to pick a word which is just hard enough to spell that not everyone can spell it, but some people can. Or an easy word: a word that all but 15% of people can spell. Not too hard? Then think about having to pick a whole ton of words of varying degrees of difficulty so that you can figure out just how good a speller someone is. Now you’ve gotta find a hundred words that exactly 77% of people (say) can spell correctly. See the problem?

Very roughly speaking, the way the GRE makers do this is simply by writing questions (not thinking about their exact difficulty) and putting them on experimental sections of tests that aren’t graded. Then they can look at how many people get those right/wrong, and try to figure out a good collection of questions.

Unfortunately, that’s not all they do. Once they give the tests, they then standardize the grades by looking at how well everyone else taking the test did. Some people who took the SAT several times (which is made similarly) will tell you that one of the times they took the exam was much harder than another (or easier); they may be right. Some GREs are harder than others. To make up for it, the grading system tries to put everyone on one common funky scale, regardless of how hard the test was.

Specifically, they aim to scale the test so that the grades form a specific distribution; that’s why the score itself is no good on its own – you have to know what percentile that score corresponds to. The Wikipedia article on the subject gives tables for each section: quantitative, verbal, and writing.

It turns out that each of these distributions is more or less unique, but many of them are roughly normal (i.e. they can be approximated by a bell curve). We can reverse engineer the data from Wikipedia to find good approximations to the distribution they use. These are the distributions below. Enjoy!

By setting the Max value to a score (and leaving Min at the bottom), you can compute the percentile of a certain score.
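The same percentile computation can be sketched offline with a normal CDF; the mean and standard deviation below are illustrative assumptions for one section, not ETS's published numbers:

```python
from scipy.stats import norm

# Illustrative normal approximation to one GRE section's scaled scores
# (scores run from 130 to 170 on the current scale).
# MEAN and SD are assumptions for demonstration only.
MEAN, SD = 151.0, 8.5

def percentile_of(score):
    """Approximate percentile rank of a scaled score under the fitted normal."""
    return 100.0 * norm.cdf(score, loc=MEAN, scale=SD)
```

A score at the assumed mean sits at the 50th percentile, and percentiles increase with the score, mirroring the Min/Max behavior in the module.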

The GMAT is composed of four parts – integrated reasoning, analytical writing, quantitative, and verbal. Using the data from mba.com, we again fit distributions to the percentiles. It turns out that only the verbal distribution is close to normal; the analytical writing and quantitative distributions are particularly far from it. To account for this, we model the analytical writing scores with a beta distribution, and continue to model the others with mixture normals.
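A mixture normal is just a weighted blend of bell curves, so its CDF (and hence a percentile) is a weighted sum of normal CDFs. A minimal sketch; the component weights, means, and standard deviations below are made up for illustration:

```python
import numpy as np
from scipy.stats import norm

def mixture_cdf(x, weights, means, sds):
    """CDF of a normal mixture: sum_k w_k * Phi((x - mu_k) / sigma_k)."""
    weights = np.asarray(weights, dtype=float)
    assert abs(weights.sum() - 1.0) < 1e-12, "weights must sum to 1"
    return float(sum(w * norm.cdf(x, m, s)
                     for w, m, s in zip(weights, means, sds)))

# Hypothetical two-component mixture for a percentile lookup.
p60 = mixture_cdf(60.0, [0.6, 0.4], [55.0, 70.0], [5.0, 4.0])
```

Because each component CDF is increasing, the mixture CDF is too, so it can be inverted numerically to go from percentile back to score.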

## And the MCAT scores?

The MCAT has three sections – physical sciences, biological sciences, and verbal reasoning – each scored from 1 to 15. Using the 2012 data from the AAMC, the distributions look like…

## The LSAT Scores

Using the LSAT score percentile data from alphascore.com, the distribution looks quite bell-shaped.

# Linear regression

One of the most common statistical methods is linear regression.

# Random triangles

At a basic level, a random triangle is simply a triangle whose corners are three random points on a piece of paper.

Mathematically speaking, a few decisions have to be made to characterize exactly how the random point selection works. Think of it this way: should every place on the piece of paper be equally likely, or should points near the middle of the page be more likely than points near the borders?

In this module, we assume that the points are coming from a bivariate normal distribution with unit variances and correlation $\rho$.

## Play with random triangles!

The following module generates bunches of random triangles using the bivariate normal distribution with correlation coefficient $\rho$. The red triangles are obtuse, and the green triangles are acute (the likelihood of seeing a right triangle is 0, so right triangles don’t get a color). You can change $\rho$ with the slider under the module. What happens as $\rho$ approaches -1 or 1?
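The same simulation can be sketched offline (the module itself is interactive; this is just an illustration). A triangle is obtuse exactly when its largest squared side exceeds the sum of the other two, by the law of cosines:

```python
import numpy as np

def random_triangle(rho, rng):
    """Three vertices drawn iid from a bivariate normal with unit variances
    and correlation rho."""
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=3)

def is_obtuse(pts):
    """Law of cosines: obtuse iff the largest squared side exceeds the sum
    of the other two (right triangles occur with probability 0)."""
    sq = sorted([np.sum((pts[i] - pts[j]) ** 2)
                 for i, j in [(0, 1), (1, 2), (2, 0)]])
    return sq[2] > sq[0] + sq[1]

rng = np.random.default_rng(1)
frac_obtuse = np.mean([is_obtuse(random_triangle(0.0, rng))
                       for _ in range(2000)])
```

With $\rho = 0$, roughly three quarters of the triangles come out obtuse, which matches the red-heavy picture the module shows.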

In Professor Strang’s lecture he discusses what the triangles look like in “triangle space”. The basic idea is that every triangle has three angles which sum to $180^{\circ}$; call them $\alpha$, $\beta$, and $\gamma$. Since any two of the angles determine the third, every triangle is represented by a single point in the “triangle space”. Further, the triangle space itself can be broken into four regions.

In the diagram below, the regions of the triangle are colored according to the kinds of triangles which are “zoned” to those regions: the red regions represent obtuse triangles, and the green region represents acute triangles. Notice that as $\rho$ approaches -1 or 1, all of the triangles get pulled towards the corners. Why?

# Statistics notation

### Introduction

Probability and statistics is replete with all sorts of strange notation. In this module, we try to clarify some notation that we use in other modules. In doing so, we provide a very brief outline of the foundations of probability and statistics. We do this at various levels of mathematical sophistication. Feel free to peruse the levels to find the one which best fits where you’re at.

### The experimental setup

Every statistics problem begins with an experiment denoted $\mathcal{E}$. It can be someone flipping a coin, determining the time it takes for a cell to divide, or determining whether a certain drug is effective – it doesn’t matter.

Of course, every experiment $\mathcal{E}$ has an outcome. For example, when flipping a coin, there are two possible outcomes, heads $H$ and tails $T$. The collection of all possible outcomes of an experiment we denote $\mathcal{S}$ and call the sample space. Mathematically, $\mathcal{S}$ is a set. For example, in the case of flipping a coin, $\mathcal{S} = \{H, T\}$.

### The set-theoretic foundations of probability

Subsets of the sample space, i.e. collections of outcomes of the experiment $\mathcal{E}$, are called events. In most cases, it is not useful to simply assign a probability to every element $s$ of the sample space. Instead, we usually assign probabilities to events, i.e. to subsets of $\mathcal{S}$.

At this point a little set theory helps and sets the stage for all of probability theory. In this article we just give the basic idea; for a more advanced exposition, look for books on measure-theoretic probability such as Resnick’s A Probability Path or Billingsley’s Probability and Measure. Both are advanced texts that require at least an undergraduate background in mathematics. The Wikipedia probability outline is also a helpful resource.

Onward! For any set $\mathcal{A}$, the power set of $\mathcal{A}$ is the set of all subsets of $\mathcal{A}$; it’s denoted $\mathcal{P}(\mathcal{A})$. For example, the subsets of the coin-flipping sample space $\mathcal{S} = \{H, T\}$ are $\{H, T\}$ itself, $\{H\}$, $\{T\}$, and $\emptyset = \{\}$, the so-called empty set, which is by definition a subset of any set (as is the set itself). So the power set is $\mathcal{P}(\mathcal{S}) = \big\{\{H,T\}, \{H\}, \{T\}, \{\}\big\}$. In general, if a set has $n$ elements, then its power set has $2^n$ elements. In the coin-flipping case, $\mathcal{S}$ has 2 elements, and the power set has $2^2 = 4$ elements.
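The power set is easy to compute for small sets; a quick sketch using Python's standard library:

```python
from itertools import chain, combinations

def power_set(s):
    """All subsets of s, from the empty set up to s itself (2**n of them)."""
    s = list(s)
    return [set(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# The coin-flip sample space: four subsets, in some order.
subsets = power_set({"H", "T"})
```

For the coin flip this returns the four events listed above, and a five-element set would yield $2^5 = 32$ subsets.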

We are now at a place where we can define a probability. A probability is a function, usually denoted $P$, which assigns a number to every element of the power set of the sample space. Of course, not just any function will do. The function $P$ must satisfy the following three properties to be a probability:

1. The probability of the sample space is 1: $P(\mathcal{S}) = 1$.
2. Probabilities can’t be negative: for any event $\mathcal{A} \in \mathcal{P}(\mathcal{S})$, $P(\mathcal{A}) \geq 0$.
3. If $\mathcal{A}$ and $\mathcal{B}$ are disjoint events (they don’t contain any of the same elements), then $P(\mathcal{A} \cup \mathcal{B}) = P(\mathcal{A}) + P(\mathcal{B})$.
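For the coin-flip example, the three properties are easy to verify directly; a small illustrative sketch (a fair coin, with each outcome carrying mass 1/2):

```python
S = {"H", "T"}   # sample space for one coin flip

def P(event):
    """Probability of an event (a subset of S) for a fair coin: an event's
    probability is the sum of its outcomes' masses."""
    mass = {"H": 0.5, "T": 0.5}
    return sum(mass[outcome] for outcome in event)

# Every event in the power set of S:
events = [set(), {"H"}, {"T"}, {"H", "T"}]
```

Checking: $P(\mathcal{S}) = 1$, every event gets a nonnegative number, and the disjoint events $\{H\}$ and $\{T\}$ satisfy $P(\{H\} \cup \{T\}) = P(\{H\}) + P(\{T\})$.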
