Question #1: 

Consider an algorithm that processes images. Suppose you are
interested in the relationship between 'processing time' and 'image
complexity'. You would like to develop a regression model to predict
'processing time' from 'image complexity' and, in particular, to
address a claim made by the designer of the algorithm that for a unit
increase in complexity, the algorithm requires only 3ms more of
processing time. You believe that the algorithm is slower than claimed.
To investigate, you select a random sample of 100 images from a corpus
of images and compute the following sample statistics:

      processing time: mean=460ms; standard deviation=100ms
      image complexity: mean=90; standard deviation=20
      correlation between 'processing time' and 'image complexity' 
      is 0.8

Solution:

First, assume that simple linear regression (SLR) is appropriate. Next,
derive the slope (b1) of the regression equation to predict
'processing time' (y) from 'image complexity' (x):

      b1=r[s(y)/s(x)]=0.8(100/20)=4
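
The slope calculation can be sketched in Python as follows. This is only
an illustration: the function name slope_from_summary is hypothetical and
not part of the original solution, and it uses nothing beyond the
correlation and the two sample standard deviations.

```python
def slope_from_summary(r, s_y, s_x):
    """Least-squares slope from summary statistics: b1 = r * s(y) / s(x)."""
    return r * s_y / s_x


# Sample statistics from Question #1 (hypothetical helper, values from the text)
b1 = slope_from_summary(r=0.8, s_y=100.0, s_x=20.0)
print(f"b1 = {b1:.1f} ms per unit of complexity")
```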

step 1: The null and alternative hypotheses are:
      H0: beta1=3 
      Ha: beta1>3

Since n is large, Theorem 2.1 part 1 applies. The rest of the
four-step hypothesis testing procedure follows:

step 2: a) n=100, b1=4, s(y)=100, s(x)=20, r=0.8
        b) b1=4>3 hence b1 consistent with Ha

step 3: a) Assume H0 true (beta1=3)
        b) P-value:
           Since n=100 is large, b1 is approximately normally
           distributed and:
                 mu(b1)=beta1=3
                  s(b1)=s(epsilon)sqrt(1/[(n-1)s^2(x)])
           Now:
             s(epsilon)=s(y)sqrt(1-r^2)=100sqrt(1-0.8^2)
                       =60
           and so:
                  s(b1)=60sqrt(1/[(99)20^2])
                       =0.3015
           Hence:
                      z=(4-3)/0.3015=3.317
           Therefore the p-value is P(Z>3.317)=0.0005 or 0.05%

step 4: This p-value is highly significant, so we reject H0
        and conclude that the change in mean processing time for a
        unit increase in complexity is greater than 3ms, which means
        that the algorithm is slower than claimed.
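
Steps 2 to 4 above can be checked numerically. The sketch below assumes
the same large-sample normal approximation used in the solution; the
function name upper_tail_slope_test is hypothetical, and the normal tail
area comes from the Python standard library rather than a z-table.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_slope_test(b1, beta1_null, s_y, s_x, r, n):
    """Large-sample z-test of H0: beta1 = beta1_null vs Ha: beta1 > beta1_null."""
    s_eps = s_y * sqrt(1 - r ** 2)                  # s(epsilon) = s(y) sqrt(1 - r^2)
    se_b1 = s_eps * sqrt(1 / ((n - 1) * s_x ** 2))  # s(b1) as in step 3
    z = (b1 - beta1_null) / se_b1
    p_value = 1 - NormalDist().cdf(z)               # upper-tail area P(Z > z)
    return se_b1, z, p_value


# Values from Question #1
se_b1, z, p_value = upper_tail_slope_test(
    b1=4, beta1_null=3, s_y=100, s_x=20, r=0.8, n=100
)
print(f"s(b1) = {se_b1:.4f}, z = {z:.3f}, p-value = {p_value:.4f}")
```

The printed values should agree with the hand calculation up to rounding.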
       

Question #2: 

Given the statistics obtained for your sample in Question #1, complete
the following:

a) Derive, state, and interpret the regression equation. Assume that
   a complexity score of zero means that the image is monochromatic
   (i.e. zero complexity).
b) Predict processing time for an image that has an 'image complexity' 
   score of 150.
 
Solution:

a) From above, b1=4. Since the regression line passes through the point
   of means, b0=mean(y)-b1[mean(x)], and so:

      b0=460-4(90)=100

   Hence the regression equation is:

      processing time=100+4(image complexity)

   SLOPE: For a unit increase in 'image complexity' we expect mean
          'processing time' to increase by 4ms.
   INTERCEPT: When 'image complexity' is zero (i.e. monochromatic) the
          predicted 'processing time' is 100ms. This could be meaningful
          and may represent the overhead (i.e. 100ms) of processing any
          image.

b) Substitute the 'image complexity' score into the regression equation:
       processing time = 100 + 4(150)
                       = 100 + 600
                       = 700ms  
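
Parts a) and b) can be reproduced with a short Python sketch. The
function names fit_from_summary and predict are hypothetical and chosen
only for illustration; the arithmetic is the same as in the solution.

```python
def fit_from_summary(mean_x, mean_y, s_x, s_y, r):
    """Return (b0, b1) of the least-squares line from summary statistics."""
    b1 = r * s_y / s_x            # slope
    b0 = mean_y - b1 * mean_x     # line passes through (mean_x, mean_y)
    return b0, b1


def predict(b0, b1, x):
    """Predicted response at x from the fitted line."""
    return b0 + b1 * x


# Sample statistics from Question #1
b0, b1 = fit_from_summary(mean_x=90, mean_y=460, s_x=20, s_y=100, r=0.8)
time_150 = predict(b0, b1, 150)
print(f"processing time = {b0:.0f} + {b1:.0f}(image complexity)")
print(f"predicted time at complexity 150: {time_150:.0f} ms")
```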
 
   
Question #3:

A colleague contends that if you examine the relationship between work
experience prior to graduation and starting salary then you will find
that an increase of one month in work experience will lead to an increase
in mean starting salary of $1800. You disagree and believe that the
increase will be much less. To investigate, you interview a random
sample of 100 CTI-02 graduates and discover the following:

  starting salary: mean = $60,000; std dev = $10,000 
  work experience (months): mean = 20; std dev = 5 
  correlation between salary and experience is 0.8

Considering the regression equation to predict starting salary from
work experience, and assuming that SLR is appropriate, answer the
following:

a) Derive the equation of the regression line. 
b) State the null and alternative hypotheses to address the contending
   points of view stated above.
c) Determine the p-value for your hypotheses.

Solution:

a) The slope is b1=0.8(10000/5)=1600 and the intercept is
   b0=60000-1600(20)=28000. Hence the regression equation is:
         salary=28000+1600(experience)

b) H0:beta1=1800
   Ha:beta1<1800

c) step 2: a) n=100, b1=1600, s(y)=10000, s(x)=5, r=0.8
           b) b1=1600<1800 hence b1 consistent with Ha

   step 3: a) Assume H0 true (beta1=1800)
           b) P-value:
              Since n=100 is large, b1 is approximately normally
              distributed and:
                 mu(b1)=beta1=1800
                  s(b1)=s(epsilon)sqrt(1/[(n-1)s^2(x)])
              Now:
                  s(epsilon)=s(y)sqrt(1-r^2)=10000sqrt(1-0.8^2)
                            =6000
              and so:
                  s(b1)=6000sqrt(1/[(99)5^2])
                       =120.6045
              Hence:
                      z=(1600-1800)/120.6045=-1.66
              Therefore the p-value is P(Z<-1.66)=0.0485 or 4.85%

   step 4: This p-value is significant at the 5% level, so we reject
           H0 and conclude that the increase in mean starting salary
           for an increase in work experience of one month is
           less than $1800.
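
The lower-tailed test in part c) can be checked with the sketch below. It
assumes the same normal approximation as the solution; the function name
slope_z_test and its alternative parameter are hypothetical and serve
only to select which tail of the normal curve is used.

```python
from math import sqrt
from statistics import NormalDist


def slope_z_test(b1, beta1_null, s_y, s_x, r, n, alternative):
    """One-sided large-sample z-test for a slope; alternative is "less" or "greater"."""
    s_eps = s_y * sqrt(1 - r ** 2)                  # s(epsilon)
    se_b1 = s_eps * sqrt(1 / ((n - 1) * s_x ** 2))  # s(b1)
    z = (b1 - beta1_null) / se_b1
    lower_tail = NormalDist().cdf(z)                # P(Z < z)
    if alternative == "less":
        p_value = lower_tail
    elif alternative == "greater":
        p_value = 1 - lower_tail
    else:
        raise ValueError('alternative must be "less" or "greater"')
    return se_b1, z, p_value


# Values from Question #3, with Ha: beta1 < 1800
se_b1, z, p_value = slope_z_test(
    b1=1600, beta1_null=1800, s_y=10000, s_x=5, r=0.8, n=100, alternative="less"
)
print(f"s(b1) = {se_b1:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

Because the code does not round z to two decimal places before finding
the tail area, its p-value may differ from the table-based 0.0485 in the
fourth decimal place.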