# Statistics, a lecture by Ricardo Nirenberg. Spring 1997, the University at Albany, Project Renaissance.

There should be no need to emphasize the importance of statistics in our society. The President is now in trouble for having pressed for and accepted certain legally dubious monetary contributions; that money was mostly spent paying for pre-election polls (finding out what the public wanted to hear), in other words, for statistics. Statistics doesn't just get presidents in trouble, however; it is as essential for everyday policy or business decisions as it is for natural science. Whenever there are too many people, too many particles, too many things or events, so that it would be nearly impossible to find out the state or behavior of each individual or the outcome of each event, we have recourse to statistics.

We will concentrate on one problem here, and although it may seem to you to be a very special one, it contains all the basic elements that go into all practical decisions based on statistics. This problem was mentioned in my previous lecture: to decide whether a coin is fair or not. Suppose we flip the coin 1,000 times and we get 485 heads and 515 tails: does the coin seem fair? Now suppose we got instead 400 heads and 600 tails: what then? Needless to say, it is extremely unlikely that out of 1,000 flips we would get exactly 500 heads and 500 tails, even if the coin were fair; on the other hand, getting too many heads, say 988, would be very strong evidence that the coin is not fair: where do we draw the line? What would be a number of heads above which we would conclude that the coin is not fair? This is a typical statistical problem. The coin turning up heads or tails is the result of too many and too tiny physical events for us to be able to answer the question as to the fairness of the coin in any other, non-statistical, way.

Statistics operates as follows: we assume that the coin is fair (this is only an assumption!), in which case the expected number of heads is half the number of tosses. Now we toss the coin N times (before, N was 1,000, but of course it doesn't have to be). Most of the time, especially if N is high, we will NOT actually get exactly N/2 heads; we will get a number of heads, N(H), which will be different from N/2. The important thing is to keep track of the difference: if it is "too high" we will conclude that the coin is not fair, and if it is not "too high" we will conclude—what? Well, merely that there is not enough experimental evidence to brand the coin as unfair (which is not the same as declaring it fair!). The main problem is to clarify what we mean by "too high."
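This procedure can be sketched in a few lines of Python; the function name and the choice of 1,000 tosses are just for illustration, following the lecture's example.

```python
import random

def toss_experiment(n_tosses, seed=None):
    """Toss a fair coin n_tosses times; return the head count
    and its difference from the expected n_tosses / 2."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads, heads - n_tosses / 2

# One run of the experiment from the lecture: N = 1,000 tosses.
heads, diff = toss_experiment(1000, seed=42)
print(heads, diff)
```

Running this repeatedly shows the point of the paragraph above: the head count is almost never exactly 500, and the question is how large the difference must be before we grow suspicious.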

To do that, we will find out what should be expected about the difference N(H) - N/2. How high should we expect it to be, assuming the coin is fair? Note that this difference can be positive or negative, so when I say "how high" I mean either high toward the positive side or toward the negative side. Let us consider the following graph (which is technically known as a "random walk"): on the horizontal axis we label the tosses by number, from 0 to N, and we draw a broken line, starting at 0: each time the coin comes up heads the line goes up one unit, each time the coin comes up tails it goes down one unit. After N tosses of the coin the broken line will end up at a certain distance from the horizontal axis; we call this distance D(N) and we have: D(N) = N(H) - N(T), the number of heads minus the number of tails. Of course, D(N) can be positive or negative, according to whether we get more heads or more tails, so it is more convenient to look at the square of D(N), which will never be negative. Now, since the first flip of the coin must result in our line either going up one unit or down one unit, we have D(1) = +1 or -1, hence D²(1) = 1. On the other hand, the distance from the horizontal after n+1 flips must be equal to the distance achieved after n flips plus or minus one unit, so: either D(n+1) = D(n) + 1 or D(n+1) = D(n) - 1. Squaring both alternatives we get: D²(n+1) equals either [D(n)+1]² or [D(n)-1]², each alternative having probability 1/2. Expanding both squares we get that D²(n+1) is either D²(n) + 2D(n) + 1 or D²(n) - 2D(n) + 1. Since each of these has probability 1/2, to find what we expect we must do as in our previous lecture: multiply each of those two quantities by 1/2 and add them. The terms 2D(n) and -2D(n) cancel, and so we get that the expected value of D²(n+1) is equal to the expected value of D²(n) + 1.
This means that as we increase the number of tosses of the coin by one unit, the expected value of the square of the distance also goes up by one unit. Since for one toss D²(1) was equal to 1, we see that finally the expected value of D²(N) is N.
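The conclusion that the expected value of D²(N) equals N can be checked by simulation; the function below is a sketch (names and trial counts are my own choices, not the lecture's).

```python
import random

def mean_squared_distance(n_tosses, n_trials, seed=0):
    """Estimate the expected value of D²(N), where D(N) = heads - tails
    is the endpoint of the random walk after n_tosses fair-coin flips."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        # Each flip moves the walk up (+1) or down (-1) with probability 1/2.
        d = sum(1 if rng.random() < 0.5 else -1 for _ in range(n_tosses))
        total += d * d
    return total / n_trials

# For a fair coin, the estimate should come out close to n_tosses.
print(mean_squared_distance(100, 20000))
```

With enough trials the printed estimate hovers near 100, matching the derivation above.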

This very important result can be expressed a little differently by taking square roots; we may say that the expected size of the distance itself, D(N), is about √N, toward either the plus or the minus side. Remember that at the beginning we were really looking not for the expected value of D(N) = N(H) - N(T), number of heads minus number of tails, but rather for the expected value of N(H) - N/2, the difference between our actual number of heads and half the number of tosses. But there's an obvious relation between these two. Since N(H) + N(T) = N (the number of heads plus the number of tails equals the number of tosses), a little simple algebra gives N(H) - N/2 = N(H) - [N(H) + N(T)]/2 = [N(H) - N(T)]/2 = D(N)/2. So finally the expected size of N(H) - N/2 turns out to be (1/2)√N, again toward either side. This number has a technical name: it's called the standard error of our coin-tossing experiment. So the standard error is half the square root of the total number of tosses.
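The standard error formula is one line of code; the function name here is a hypothetical label for the lecture's (1/2)√N.

```python
import math

def standard_error(n_tosses):
    """Standard error of the head count in n_tosses fair-coin flips:
    half the square root of the number of tosses."""
    return 0.5 * math.sqrt(n_tosses)

print(standard_error(100))    # 5.0
print(standard_error(1000))   # about 15.8
```

These are exactly the two numbers used in the examples that follow: for 100 tosses the typical deviation from 50 heads is about 5, and for 1,000 tosses the typical deviation from 500 is about 16.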

So let us look back at the numbers I gave you before. The first question was: suppose we get 485 heads and 515 tails: does the coin seem fair? For 1,000 tosses the standard error is (1/2)√1000, or about 16, so a fair coin's head count typically falls within one SE of 500, that is, in the range 484 to 516. Well, 485 heads is within that range, so there's nothing exceptional about our result. For all we know, the coin may be fair. But the second question was: suppose we got 400 heads and 600 tails: what then? Now 400 heads is 100 units away from the average 500, and since each SE is 16, our number of heads is about 6 standard errors away from 500. The probability of getting a result 6 SEs away from the average (assuming the coin was fair) is smaller than the chance of winning the lottery; it is fantastically low. And so the statistician reasons as follows: under the assumption that the coin was fair, we got a fantastically unlikely event, almost a miracle; rather than believe in miracles, I prefer to reject the original assumption that the coin was fair, so I pronounce it not fair (it is more likely to land tails than heads). Statisticians call this second case "statistically significant," meaning it allows them to reject the original hypothesis and say the coin is not fair. The first case, when we got 485 heads, would not be statistically significant, for it allows them to conclude nothing.
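The "how many SEs away" calculation for both examples can be sketched as follows (the function name is mine, not standard terminology):

```python
import math

def ses_away(heads, n_tosses):
    """How many standard errors the observed head count lies
    from the expected n_tosses / 2 (negative means too few heads)."""
    se = 0.5 * math.sqrt(n_tosses)          # standard error: (1/2)√N
    return (heads - n_tosses / 2) / se

print(ses_away(485, 1000))   # about -0.95: unremarkable
print(ses_away(400, 1000))   # about -6.3: fantastically unlikely for a fair coin
```

Less than one SE away is everyday behavior for a fair coin; six SEs away is the near-miracle the statistician refuses to believe in.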

There are subtle questions here. In the two numerical examples I just gave you, the situation was clear-cut, but suppose now the number of heads is borderline. Suppose we get 468 heads. We reason as follows: 468 is 32 units away from 500, now 32 units means two SEs (one SE being 16), so let's look at the probability of our coin deviating from the expected 500 heads by 32 units or more: it turns out (from the bell curve) to be about 5%, or 0.05. An event with probability 0.05 can be expected to happen 5 times out of 100. You can't quite call it a miracle, but is that probability small? It all depends on the real-life situation. A statistician cannot tell you whether 5% is small or not; all he can tell you is: "Assuming your coin was fair, the probability of getting the kind of result we got (we got 468 heads) is about 5%." It's up to you then to decide whether or not 5% is small enough to warrant the conclusion that the coin is not fair. To call 5% small, or to say, no, we'll consider small only what's below 1%, or below 0.1%, or what have you, is always a policy decision, not a statistician's.
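The bell-curve probability the lecture reads off can be computed with the complementary error function; this sketch uses the normal approximation that underlies the "about 5%" figure (the function name is hypothetical).

```python
import math

def two_sided_p(heads, n_tosses):
    """Probability that a fair coin deviates from n_tosses/2 by at least
    as much as observed, toward either side, using the normal (bell-curve)
    approximation: P(|Z| >= z) = erfc(z / √2)."""
    se = 0.5 * math.sqrt(n_tosses)
    z = abs(heads - n_tosses / 2) / se
    return math.erfc(z / math.sqrt(2))

print(two_sided_p(468, 1000))   # roughly 0.04-0.05, the borderline case
print(two_sided_p(400, 1000))   # vanishingly small: the "miracle" case
```

Whether that borderline 5%-ish probability counts as "small" remains, as the paragraph above says, a policy decision and not a computation.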

Let's go back to our standard error. Remember that it was the expected difference between the number of heads in N tosses of a fair coin and N/2 (the number of heads one expects). This expected difference turned out to be (1/2)√N. So if we toss a fair coin 100 times we may expect the number of heads to differ from 50 by about plus or minus 5; if we toss a fair coin 10,000 times we may expect the number of heads to differ from 5,000 by about plus or minus 50, and so on. Notice that the more times we toss the coin, the higher the expected difference will be. So it is NOT true that the more we toss the coin, the closer we can expect the number of heads to be to half the number of tosses. The truth is contained in the LAW OF LARGE NUMBERS, which says that the RELATIVE difference, that is, the expected difference between the number of heads and half the number of tosses DIVIDED by the total number of tosses, [N(H) - N/2]/N, will become closer and closer to 0 as we increase N. This fundamental result, without which all statistics would be impossible, is due to the fact that the numerator, the expected difference N(H) - N/2, is the standard error (1/2)√N, and that square root saves the day! For we have (1/2)√N / N = (1/2)(1/√N), and 1/√N gets closer and closer to 0 as N gets large.
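A few lines of arithmetic make the contrast vivid: the absolute deviation (1/2)√N grows with N, while the relative deviation (1/2)/√N shrinks to 0.

```python
import math

# For each N: the standard error (1/2)√N grows,
# but the relative error (1/2)√N / N = (1/2)/√N shrinks toward 0.
for n in (100, 10_000, 1_000_000):
    se = 0.5 * math.sqrt(n)
    print(n, se, se / n)
```

For 100 tosses the typical deviation is 5 heads, i.e. 5% of the tosses; for a million tosses it is 500 heads, yet only 0.05% of the tosses.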

Let's see how this works with polls. Suppose you are working for Teresita Balboa, running for the Senate from New York, and you want to poll the voters to predict the outcome; but Ms Balboa is running on a shoestring, so you decide to take a sample of just 100 voters in the state and ask them how they are going to vote. Several comments here: first, polling people is expensive, and that's why no one ever polls all the voters; the larger the sample, the more expensive the procedure. Secondly, the way one chooses the sample is crucial; the most famous cases where polls went wrong (Roosevelt vs Landon, Truman vs Dewey) were based on faulty samples. What is a faulty sample, or in technical terms, a biased sample? If we take a sample of 100 voters in New York, it may be biased because it contains a disproportionate number of Republicans or Democrats. If we select our sample by looking at the directory of a teachers' union, for example, we can be confident that we'll choose more Democrats than if we were to select our sample from the membership roster of the most expensive golf club; indeed, we would get too many Democrats compared to the whole state, which would make our predictions unreliable. How to choose samples is a difficult art or science, an important part of statistics, but we have no time to dwell on it.
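A poll of 100 voters can be simulated in the same spirit as the coin tosses; the electorate below, with 52% support for Balboa, is an invented number purely for illustration.

```python
import random

def poll(population, sample_size, seed=None):
    """Estimate a candidate's support from a simple random sample.
    population: list of True (will vote Balboa) / False entries."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical electorate: 52% actually favor Balboa.
electorate = [True] * 52_000 + [False] * 48_000
print(poll(electorate, 100, seed=1))   # one poll of 100 voters
# With only 100 voters sampled, single-poll estimates easily swing
# several percentage points around the true 0.52, per the (1/2)√N reasoning.
```

Note that this sketch draws a simple random sample; a biased selection rule (say, sampling only union members) would shift every estimate systematically, an error that no increase in sample size can repair.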