Hypothesis Testing (μy)

Readings:
  1. Ott; 5.4, 5.7-5.8
Recall:

The hypothesis testing problem involves two contrasting points of view (i.e. hypotheses) about a population. The problem is to determine which hypothesis is more reasonable.

e.g. Consider the CTI-02 problem, where a journalist reports that on average CTI-02 graduates receive starting salaries of $50000, but the CTI Dean believes that on average CTI-02 graduates received better offers.

To determine which hypothesis is more reasonable we use a strategy that is similar to proof by contradiction. Remember that this is a technique where, if you wish to prove something false, you assume it to be true and then, by a logically consistent argument, see if you are led to a contradiction of the initial assumption. Similarly, for hypothesis testing, we assume one hypothesis to be true, select a random sample from the population in question, and see if the sample provides evidence to refute the assumption. Bear in mind that hypothesis testing is based on sampling distribution theory.

Terminology:

For our purposes, a hypothesis is a statement about a population that may be expressed in terms of one or more population parameters.

e.g. μy = 50000; μy > 50000

Given any hypothesis testing problem you will always identify two hypotheses:

Null hypothesis:

Denoted by H0 and, for this class, will be of the following form, where m is a specified value for the population mean:

H0: μy = m

Note: This is the point of view of no change; the point of view of the skeptic; the point of view to be challenged.

For the CTI-02 problem: H0: μy = 50000

Alternative hypothesis:

Denoted by Ha or H1 or Hr. It is sometimes referred to as the research hypothesis, hence Hr. It will be of one of the following forms:

Ha: μy > m; Ha: μy < m or Ha: μy ≠ m

Note: This is the point of view of change; the point of view of the optimist; the point of view of the challenger.

For the CTI-02 problem: Ha: μy > 50000

Hypothesis Testing Procedure (μy)

To conduct a hypothesis test for μy you will use the following four-step procedure:

  1. Examine the problem and identify and state the null and alternative hypotheses.

    e.g. Suppose that after examining a problem statement you identify the null hypothesis to be μy = m and you think that μy > m. You would state this as follows:

    H0: μy = m
    Ha: μy > m

  2. Examine your sample and do the following:
    1. Determine the sample size n and the statistics ybar and sy.
    2. Ensure that ybar is consistent with your alternative hypothesis (i.e. if Ha is μy > m then ensure that ybar > m, and if Ha is μy < m then ensure that ybar < m).

    Note: An inconsistent ybar means that your sample does not support your alternative, so you cannot reject H0 and there is no need to proceed.

  3. If ybar is consistent then do the following:
    1. Assume H0 true.
    2. Given that H0 is assumed true, determine the p-value.

    Note: The p-value is the proportion of samples of size n that would result in a ybar at least as extreme as the one observed if H0 is true. To determine it you must consider the sampling distribution of ybar. Remember that the sample size determines the sampling distribution:

    1. n large:

      Since we assume H0 true:

      μybar = μy = m.

      Since σy is not known we estimate it with sy, and so we estimate σybar by:
      sybar = sy/sqrt(n).

      Compute:
      z = (ybar - μybar)/sybar

      Depending on your alternative hypothesis, determine the desired proportion (i.e. the p-value).

    2. n small:
      1. Establish that y is normally distributed.
        i.e. Conduct a normality test if SAS is available. Otherwise, assume normality.
      2. If y is normally distributed then:

        Since we assume H0 true:

        μybar = μy = m.

        Again, σy is not known but we can estimate it with sy, and so we estimate σybar by:
        sybar = sy/sqrt(n).

        Since y is normally distributed, compute:
        t = (ybar - μybar)/sybar

        Depending on your alternative hypothesis, determine the desired proportion (i.e. the p-value) from the t distribution with n - 1 degrees of freedom.

      3. If y is NOT normally distributed then conduct a Wilcoxon signed-rank test.

    Note: The z and t values computed for hypothesis testing problems are sometimes referred to as test statistics.

  4. Apply the following decision rule to your p-value.
    1. If p-value is <= 1% then the p-value is highly significant and so you reject H0.
    2. If p-value is > 1% but <= 5% then the p-value is significant and so you reject H0.
    3. If p-value is > 5% then the p-value is non-significant and so you have insufficient evidence to reject H0.
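
The large-sample branch of the procedure and the decision rule can be sketched in Python. This is a minimal illustration and not part of the original notes: the function names z_test_p_value and decide, the labels used for the alternative, and the sample figures in the usage lines are all hypothetical. The normal curve areas come from the standard library's statistics.NormalDist.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(ybar, sy, n, m, alternative):
    """Large-sample test statistic and p-value for H0: mu_y = m.

    alternative is "greater", "less" or "two-sided"
    (labels chosen for this sketch only).
    """
    sybar = sy / sqrt(n)            # estimated standard error of ybar
    z = (ybar - m) / sybar          # test statistic, computed assuming H0 true
    upper_tail = 1.0 - NormalDist().cdf(z)
    if alternative == "greater":
        p = upper_tail
    elif alternative == "less":
        p = 1.0 - upper_tail
    elif alternative == "two-sided":
        p = 2.0 * min(upper_tail, 1.0 - upper_tail)
    else:
        raise ValueError("unknown alternative: " + alternative)
    return z, p


def decide(p_value):
    """Apply the 1% / 5% decision rule described in the notes."""
    if p_value <= 0.01:
        return "highly significant: reject H0"
    if p_value <= 0.05:
        return "significant: reject H0"
    return "non-significant: insufficient evidence to reject H0"


# Hypothetical large sample: n = 100, ybar = 51, sy = 4, testing m = 50.
z, p = z_test_p_value(ybar=51.0, sy=4.0, n=100, m=50.0, alternative="greater")
print(round(z, 2), decide(p))
```

The small-sample branch has the same shape, except that the tail area is taken from the t distribution with n - 1 degrees of freedom instead of the normal curve.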

 

Problem:

Consider the CTI-02 graduates problem. You select a sample of twenty-five from the graduating class and determine that the mean starting salary is $52K with a standard deviation of $4K.

Conduct a test of hypotheses.

Solution:

Applying the procedure:

  1. Identify and state the null and alternative hypotheses.

    H0: μy = $50K
    Ha: μy > $50K

  2. Examining the sample:
    1. Determine the sample size n and the statistics ybar and sy:

      n=25; ybar=$52K and sy=$4K

    2. Ensure that ybar is consistent with the alternative hypothesis:

      Since ybar > $50K, ybar is consistent with Ha and we may proceed.

  3. Since ybar is consistent:
    1. Assume H0 true (i.e. μy = $50K).
    2. Given that H0 is true, determine the p-value.

      n small:

      Since we are not using SAS let us assume that y is normally distributed in order to proceed.

      μybar = μy = 50 (working in $K);

      sybar = sy/sqrt(n) = 4/sqrt(25) = 0.8;

      t = (52 - 50)/0.8 = 2.5

      Hence the t test statistic is 2.5 and, from the t-table with df = n - 1 = 24, the desired proportion (i.e. the p-value) is between 0.5% and 1%.

  4. Apply the decision rule to your p-value.

    Since the p-value is <= 1%, the p-value is highly significant and so we reject H0 and conclude that the mean starting salary is higher than reported (i.e. μy > $50K).
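
As a rough numerical check of the t-table lookup in the solution, the upper-tail area of the t distribution can be approximated without statistical software. The sketch below is an illustration rather than part of the original solution. It integrates the t density with Simpson's rule using only the Python standard library. The helper names t_density and t_upper_tail and the number of integration steps are assumptions made here.

```python
from math import gamma, pi, sqrt


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    const = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even count
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # odd interior points get weight 4
        total += weight * t_density(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t            # the density is symmetric about 0


# Figures from the CTI-02 problem (in $K).
n, ybar, sy, m = 25, 52.0, 4.0, 50.0
sybar = sy / sqrt(n)                    # estimated standard error of ybar
t_stat = (ybar - m) / sybar
p_value = t_upper_tail(t_stat, df=n - 1)
print(t_stat, 0.005 < p_value < 0.01)
```

Under these assumptions the computed area falls between 0.5% and 1%, in agreement with the t-table bounds used in the solution.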