Question #1: Consider an algorithm that processes images. Suppose you are interested in the relationship between 'processing time' and 'image complexity'. You would like to develop a regression model to predict 'processing time' from 'image complexity' and, in particular, to address a claim made by the designer of the algorithm that for a unit increase in complexity, the algorithm requires only 3ms more of processing time. You believe that the algorithm is slower than claimed. To investigate, you select a random sample of 100 images from a corpus of images and compute the following sample statistics:

processing time: mean=460ms; standard deviation=100ms
image complexity: mean=90; standard deviation=20
correlation between 'processing time' and 'image complexity' is 0.8

Solution:
First, assume that SLR is appropriate. Next, derive the slope (b1) of the regression equation to predict 'processing time' from 'image complexity':
b1=r[s(y)/s(x)]=0.8(100/20)=4

step 1: The null and alternative hypotheses are:
H0: beta1=3
Ha: beta1>3
Since n is large, Theorem 2.1 part 1 applies. The rest of the four-step hypothesis testing procedure follows:

step 2:
a) n=100, b1=4, s(y)=100, s(x)=20, r=0.8
b) b1=4>3 hence b1 is consistent with Ha

step 3:
a) Assume H0 true (beta1=3)
b) P-value: Since n=100 is large, b1 is approximately normally distributed with:
mu(b1)=beta1=3
s(b1)=s(epsilon)sqrt[1/((n-1)s^2(x))]
Now:
s(epsilon)=s(y)sqrt(1-r^2)=100sqrt(1-0.8^2)=60
and so:
s(b1)=60sqrt[1/((99)20^2)]=0.3015
Hence:
z=(4-3)/0.3015=3.317
Therefore the p-value is P(Z>3.317), which is approximately 0.0005 or 0.05%

step 4: This p-value is highly significant and so we reject H0 and conclude that the change in processing time for a unit increase in complexity is greater than 3ms, which means that the algorithm is slower than claimed.

Question #2: Given the statistics obtained for your sample, complete the following:
a) Derive, state, and interpret the regression equation.
Assume that a complexity score of zero means that the image is monochromatic (i.e. zero complexity).
b) Predict processing time for an image that has an 'image complexity' score of 150.

Solution:
a) From above, b1=4 and so b0 is:
b0=460-4(90)=100
Hence the regression equation is:
processing time=100+4(image complexity)
SLOPE: For a unit increase in 'image complexity' we can expect mean 'processing time' to increase by 4ms.
INTERCEPT: When 'image complexity' is zero (i.e. monochromatic) the predicted 'processing time' is 100ms. This could be meaningful and may represent the overhead (i.e. 100ms) in processing an image.
b) Plug the 'image complexity' score into the regression equation:
processing time = 100 + 4(150) = 100 + 600 = 700ms

Question #3: A colleague contends that if you examine the relationship between work experience prior to graduation and starting salary, you will find that an increase of one month in work experience leads to an increase in mean starting salary of $1800. You disagree and believe that the increase will be much less. You decide to investigate by interviewing a random sample of 100 CTI-02 graduates. You discover the following:

starting salary: mean = $60,000; std dev = $10,000
work experience (months): mean = 20; std dev = 5
correlation between salary and experience is 0.8

Considering the regression equation to predict starting salary from work experience, and assuming that SLR is appropriate, answer the following:
a) Derive the equation of the regression line.
b) State the null and alternative hypotheses to address the contending points of view stated above.
c) Determine the p-value for your hypotheses.

Solution:
a) The slope is b1=0.8(10000/5)=1600 and the intercept is b0=60000-1600(20)=28000.
Hence the regression equation is:
salary=28000+1600(experience)

b) H0: beta1=1800
Ha: beta1<1800

c) step 2:
a) n=100, b1=1600, s(y)=10000, s(x)=5, r=0.8
b) b1=1600<1800 hence b1 is consistent with Ha

step 3:
a) Assume H0 true (beta1=1800)
b) P-value: Since n=100 is large, b1 is approximately normally distributed with:
mu(b1)=beta1=1800
s(b1)=s(epsilon)sqrt[1/((n-1)s^2(x))]
Now:
s(epsilon)=s(y)sqrt(1-r^2)=10000sqrt(1-0.8^2)=6000
and so:
s(b1)=6000sqrt[1/((99)5^2)]=120.6045
Hence:
z=(1600-1800)/120.6045=-1.66
Therefore the p-value is P(Z<-1.66)=0.0485 or 4.85%

step 4: This p-value is significant at the 5% level and so we reject H0 and conclude that the increase in mean starting salary for an increase in work experience of one month is less than $1800.
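
The calculations in Questions #1 to #3 all follow the same recipe, so they can be checked with a short script. The following is a minimal sketch in Python, not part of the original solutions: the function names (normal_cdf, slope_z_test) and the 'tail' parameter are hypothetical, and it assumes the same large-sample normal approximation and the same s(epsilon)=s(y)sqrt(1-r^2) shortcut used above.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def slope_z_test(n, mean_x, sd_x, mean_y, sd_y, r, beta1_null, tail):
    """Large-sample z test for the SLR slope from summary statistics.

    tail is "greater" for Ha: beta1 > beta1_null,
    or "less" for Ha: beta1 < beta1_null.
    """
    b1 = r * sd_y / sd_x                      # sample slope
    b0 = mean_y - b1 * mean_x                 # sample intercept
    s_eps = sd_y * math.sqrt(1.0 - r ** 2)    # s(epsilon), as in the text
    se_b1 = s_eps * math.sqrt(1.0 / ((n - 1) * sd_x ** 2))  # s(b1)
    z = (b1 - beta1_null) / se_b1
    if tail == "greater":
        p_value = 1.0 - normal_cdf(z)
    elif tail == "less":
        p_value = normal_cdf(z)
    else:
        raise ValueError("tail must be 'greater' or 'less'")
    return {"b0": b0, "b1": b1, "s_eps": s_eps,
            "se_b1": se_b1, "z": z, "p_value": p_value}


# Question #1: processing time (y) versus image complexity (x)
q1 = slope_z_test(n=100, mean_x=90, sd_x=20, mean_y=460, sd_y=100,
                  r=0.8, beta1_null=3, tail="greater")

# Question #2 b): predicted processing time at complexity 150
q2_prediction = q1["b0"] + q1["b1"] * 150

# Question #3: starting salary (y) versus work experience (x)
q3 = slope_z_test(n=100, mean_x=20, sd_x=5, mean_y=60000, sd_y=10000,
                  r=0.8, beta1_null=1800, tail="less")

for label, result in (("Question #1", q1), ("Question #3", q3)):
    print(label, {k: round(v, 4) for k, v in result.items()})
print("Question #2 prediction (ms):", round(q2_prediction, 1))
```

Because the script uses the unrounded z value rather than a table lookup at two decimal places, its p-values may differ slightly in the last digit from those quoted in the solutions.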