Hypothesis Testing

Contents

Objective  The Scientific Method  Significance Testing  Acceptance/Rejection Testing  Examples 

^ Learning Objectives

The Hypothesis Testing module targets the following cognitive tasks:

Task | Skills | Concepts
HT-1 | - | Understand the difference between descriptive and inferential statistics
HT-2 | - | Understand the steps of a hypothesis test
HT-3 | Construct the null hypothesis | Understand the nature of the null hypothesis
HT-4 | Construct the alternative hypothesis | Understand the nature of the alternative hypothesis
HT-5 | Choose an appropriate significance level | Understand the significance level
HT-6 | Describe the Type I error in the context of a hypothesis test | Understand the Type I error
HT-7 | Describe the Type II error in the context of a hypothesis test | Understand the Type II error
HT-8 | - | Understand the purpose of the test statistic
HT-9 | - | Understand how to make a decision (classical and p-value methods)
HT-10 | - | Understand what the conclusion should contain
HT-11 | - | Understand the concept of "power" of a hypothesis test
HT-12 | - | Understand the difference between one- and two-sided Ha
HT-13 | - | Understand the relation between two-sided hypothesis tests and confidence intervals

^ The Scientific Method

Statistical inference is concerned with making statements about the population based on information in the sample. There are two principal ways of making statistical inferences: confidence interval estimation, which was presented earlier, and hypothesis testing. This module focuses on hypothesis testing.

Hypothesis testing, and more generally statistics, is at the heart of the scientific method. Thus, the scientific method is reviewed from a statistical perspective. This provides a frame of reference for the steps comprising hypothesis testing.

The steps underlying the scientific method are:

  1. State the Problem: the problem to be solved
  2. Formulate the Hypotheses: the specification of the probability models for the cases in which the experimenter's speculation is true (the experimental or research hypothesis) or not true (the null hypothesis)
  3. Design the Experiment or Survey: the variables of interest; the sampling method (e.g., probability sampling); the methods of controlling variability (e.g., blocking); and the method of applying treatments to observations (e.g., randomization)
  4. Make Observations: measurements consisting of the true value, the bias (minimized by the experimental technique), and the chance error (minimized by the design)
  5. Analyze and Interpret the Data: develop a test statistic, test the hypothesis, and probabilistically assess the null hypothesis
  6. Draw Conclusions: reject or fail to reject the null hypothesis

A scientific hypothesis must be testable, i.e., a test statistic T must be available that can distinguish between the null and research hypotheses. A hypothesis can never be proven to be true. However, we can make probability-based statements about the plausibility of the hypothesis.

^ Significance Testing

Hypothesis tests are of two types: significance tests and acceptance/rejection tests. The idea behind significance tests is presented first in the context of an example.

Consider the claim that the breast cancer rate among Puerto Rican-born women living in New York is higher than the Puerto Rican breast cancer rate, which is known to be 0.0035. In terms of the scientific method, we can formulate the problem as follows:

Research Hypothesis:
The breast cancer rate of Puerto Rican-born women living in New York is higher than the Puerto Rican breast cancer rate.
Chance Explanation:
The observed difference in breast cancer rate is due to chance.

We state these hypotheses in terms of the population proportion p of Puerto Rican-born women living in New York who have breast cancer. Specifically,

HA: p > 0.0035
Ho: p = 0.0035
The research hypothesis, denoted by HA, is called the alternative hypothesis, whereas the chance explanation, denoted by Ho, is called the null hypothesis.

The experiment consists of drawing n = 10,500 Puerto Rican-born women at random from the population of Puerto Rican women living in New York City (N = 200,000) and determining for each woman whether or not she has breast cancer. Since the sample size n is small (approximately 5% of the population size), the binomial probability model is a good approximation to the actual hypergeometric sampling model. Under the null hypothesis, the binomial probability model is completely specified, i.e., if X represents the number of Puerto Rican-born women in the sample with breast cancer, then X ~ Binomial(n = 10,500, p = 0.0035). On the other hand, a family of binomial probability models, one for each p > 0.0035, is specified under the alternative hypothesis, i.e., X ~ Binomial(n = 10,500, p) for p > 0.0035.
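The quality of the binomial approximation can be checked numerically. The following is a minimal sketch, assuming that under Ho exactly K = N × 0.0035 = 700 women in the population have breast cancer; the function names and the value of K are illustrative assumptions, not part of the module.

```python
import math

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p), computed on the log scale."""
    return math.exp(log_comb(n, x) + x * math.log(p) + (n - x) * math.log1p(-p))

def hypergeom_pmf(x, N, K, n):
    """P(X = x) when n units are drawn without replacement from N, K of them 'successes'."""
    return math.exp(log_comb(K, x) + log_comb(N - K, n - x) - log_comb(N, n))

N = 200_000          # population size (from the text)
n = 10_500           # sample size (from the text)
p0 = 0.0035          # rate under Ho (from the text)
K = round(N * p0)    # assumed number of cases in the population under Ho

xs = range(0, K + 1)  # a sample cannot contain more than K cases
binom = [binomial_pmf(x, n, p0) for x in xs]
hyper = [hypergeom_pmf(x, N, K, n) for x in xs]

# Largest pointwise gap between the two probability models
max_gap = max(abs(b - h) for b, h in zip(binom, hyper))
mean_hyper = sum(x * h for x, h in zip(xs, hyper))

print(f"largest pmf difference: {max_gap:.6f}")
print(f"hypergeometric mean:    {mean_hyper:.4f}")
```

Both models have the same mean, np. The hypergeometric variance is smaller by the finite-population factor (N - n)/(N - 1), so the two distributions differ slightly in spread.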

Suppose 62 women in the sample have breast cancer. This is higher than the number expected to have cancer if the null hypothesis is true, i.e., E(X) = np = 10,500 (0.0035) = 36.75 under Ho. Does this invalidate the null hypothesis or is it possible that a value of x = 62 or greater could have occurred under Ho with reasonable probability?
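To put the observed count in perspective, the expected count under Ho can be compared with x = 62 in units of the binomial standard deviation. This is a small sketch; the standard deviation and the standardized distance are added here for illustration and are not computed in the module itself.

```python
import math

n = 10_500      # sample size (from the text)
p0 = 0.0035     # rate under Ho (from the text)
observed = 62   # observed number of cases (from the text)

expected = n * p0                       # E(X) under Ho
sd = math.sqrt(n * p0 * (1 - p0))       # binomial standard deviation under Ho
z = (observed - expected) / sd          # distance from E(X) in standard deviations

print(f"expected count under Ho: {expected:.2f}")
print(f"standard deviation:      {sd:.2f}")
print(f"standardized distance:   {z:.2f}")
```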

A natural test statistic is T = X = number of women in the sample with breast cancer. If T is large (relative to the expected value under Ho), then the evidence mounts against the null hypothesis. In particular, we are interested in the probability

P(T ≥ t | Ho), where t is the observed value of T.
If this probability is small, then the observed T is in the right-hand tail of the binomial distribution and it is unlikely to have occurred if Ho is true. This probability is called the P-value or the observed significance level.

In this example, we need to compute:

P(X ≥ 62 | Ho) = P(X ≥ 62 | n = 10,500, p = 0.0035).
This probability can be computed exactly using the binomial probability mass function or approximately using the normal distribution (by the central limit theorem). We will use the binomial distribution, but the probabilities are difficult to compute by hand since n is large and p is small. Therefore, we will use the five-step method to simulate the P-value.
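As a cross-check on any simulated value, the two analytic routes mentioned above (the exact binomial tail and the normal approximation) can be sketched with the Python standard library. This is not the five-step simulation; the function names are illustrative, and the continuity correction in the normal approximation is an assumption added here.

```python
import math

def log_binomial_pmf(x, n, p):
    """log P(X = x) for X ~ Binomial(n, p); log scale avoids overflow for large n."""
    return (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
            + x * math.log(p) + (n - x) * math.log1p(-p))

def binomial_upper_tail(x_obs, n, p):
    """Exact P(X >= x_obs), summing the pmf from x_obs up to n."""
    return math.fsum(math.exp(log_binomial_pmf(x, n, p))
                     for x in range(x_obs, n + 1))

def normal_upper_tail(x_obs, n, p):
    """Normal approximation to P(X >= x_obs) with a continuity correction of 0.5."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (x_obs - 0.5 - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))   # equals 1 - Phi(z)

n, p0, x_obs = 10_500, 0.0035, 62   # values from the example

p_exact = binomial_upper_tail(x_obs, n, p0)
p_normal = normal_upper_tail(x_obs, n, p0)

print(f"exact binomial P-value:       {p_exact:.2e}")
print(f"normal-approximation P-value: {p_normal:.2e}")
```

Under these assumptions both values fall far below common significance levels such as 0.05. The normal approximation understates the exact tail here, because the binomial distribution is right-skewed when p is small.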

^ Acceptance/Rejection Testing

^ Examples

Example #1

Self-test