Hypothesis Testing - Proportions:

Problem #1:

An application development manager claims that 75% of programs in the
production portfolio are Y2K compliant. The EDP auditor disputes this
claim and believes that the proportion is less than claimed. She
examines 200 of these programs and discovers that 142 are compliant.

1.  	Give the appropriate null and one-sided research hypotheses
	that correspond to the auditor's belief.

		H0: pi = 0.75
		Ha: pi < 0.75

2.	Calculate the p-value for the hypotheses stated above.

	The sample is large, so the normal approximation applies. The sample
	proportion is p=142/200=0.71 and, assuming H0 is true,
	SD(p)=sqrt((0.75)*(0.25)/200)=0.031, so the z value is:

	       z = (0.71-0.75)/SD(p)
		 = -1.3

	From normal tables, 80.64% of the distribution lies between -1.3
	and +1.3, so the area below -1.3 is (100-80.64)/2=9.68%. The
	p-value is therefore about 0.097, or roughly 0.1.


3.	Is this significant, highly significant, or non-significant?

		Non-significant: the p-value of about 0.1 is above the
		conventional 5% level.


4.	Comment on the auditor's belief.

	Since this is a non-significant result, the EDP auditor has
	insufficient evidence to reject H0 and so has insufficient
	evidence to challenge the manager's claim.
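
The calculation in Problem #1 can be checked with a short Python sketch.
The function name lower_tail_proportion_test is made up for this
illustration, and the code works with the unrounded z rather than -1.3, so
its p-value differs slightly from the table-based figure.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_proportion_test(successes, n, pi0):
    """Large-sample z test of H0: pi = pi0 against Ha: pi < pi0."""
    p_hat = successes / n
    sd = sqrt(pi0 * (1 - pi0) / n)   # SD(p) computed assuming H0 is true
    z = (p_hat - pi0) / sd
    p_value = NormalDist().cdf(z)    # area to the left of z
    return z, p_value

# Problem #1: 142 compliant programs out of 200, claimed proportion 0.75
z, p_value = lower_tail_proportion_test(142, 200, 0.75)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

The z value comes out close to -1.3 and the p-value just under 0.10, in
line with the hand calculation.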


Problem #2:

A production manager claims that 50% of disk drives coming off a production
line have faster seek times than stipulated in the specifications. The QA
manager believes that the actual proportion is less than claimed. She takes
a sample of 350 drives from a recent production run and finds that 102 are
faster.
 
1.  	Give the appropriate null and one-sided research hypotheses
	that correspond to the QA manager's belief.

		H0: pi = 0.5
		Ha: pi < 0.5

2.	Conduct a test of hypotheses.

	Since p=102/350=0.29 and SD(p)=sqrt((0.5)*(0.5)/350)=0.027 then:

		z=(0.29-0.5)/0.027=-7.8

	The p-value is almost zero. This is a highly significant result and
	so the null hypothesis should be rejected. The QA manager therefore
	has enough evidence to support her belief that the true proportion
	is less than claimed.
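
As a rough check on Problem #2, the sketch below recomputes z and attaches
a significance label. The 5% and 1% cut-offs are assumed conventions, not
values stated in these notes, and significance_label is a hypothetical
helper name.

```python
from math import sqrt
from statistics import NormalDist

def significance_label(p_value, alpha=0.05, alpha_high=0.01):
    """Label a p-value; the 5% and 1% cut-offs are assumed conventions."""
    if p_value < alpha_high:
        return "highly significant"
    if p_value < alpha:
        return "significant"
    return "non-significant"

# Problem #2: 102 of 350 drives are faster; H0: pi = 0.5, Ha: pi < 0.5
n, faster, pi0 = 350, 102, 0.5
sd = sqrt(pi0 * (1 - pi0) / n)       # SD(p) computed assuming H0 is true
z = (faster / n - pi0) / sd
p_value = NormalDist().cdf(z)        # lower-tail area, effectively zero here

print(f"z = {z:.1f}, result: {significance_label(p_value)}")
```

Under the same assumed cut-offs, a p-value of about 0.097 would be
labelled non-significant.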


Sample Size Determination:

Problem #1:

Find the sample size required to estimate the mean seek time of disk
drives from a production run to within 0.025 ms with 99% confidence.
Assume that SD(y) is known to be 0.075 ms from a previous study.

Solution:

The z value for a 99% CI is approximately 2.6, and you know from CI
theory that the error in your estimate is z(99)*(SD(y)/sqrt(n)). Setting
this equal to the required accuracy:

	0.025=2.6*(SD(y)/sqrt(n))
Rearranging terms:
	n=((2.6)*(0.075)/0.025)**2=60.84

Rounding up, 61 disk drives would be required.
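
The rearrangement can be sketched in Python as follows. The function name
sample_size_for_mean is an illustrative choice. The second call uses the
unrounded 99% normal quantile (about 2.576) in place of 2.6, which is why
it gives a slightly smaller answer.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sd, margin, z):
    """Solve margin = z * sd / sqrt(n) for n (returned unrounded)."""
    return (z * sd / margin) ** 2

# Rounded z value, as in the hand calculation
n_rounded_z = sample_size_for_mean(0.075, 0.025, z=2.6)

# Unrounded two-sided 99% quantile: 0.5% in each tail
z_exact = NormalDist().inv_cdf(0.995)
n_exact_z = sample_size_for_mean(0.075, 0.025, z=z_exact)

# Always round up, since a fractional drive cannot be sampled
print(ceil(n_rounded_z), ceil(n_exact_z))
```

Rounding z up to 2.6 errs on the side of a slightly larger sample.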

Problem #2:

Estimate the sample size required to estimate the proportion of programs
in a portfolio that are Y2K compliant to an accuracy of 0.05 with 95%
confidence.

Solution:

Since an estimate of pi is not provided, the conservative choice is to
use 0.5, the value at which pi*(1-pi) is largest. You know from CI theory
that the error in your estimate is z(95)*sqrt(pi*(1-pi)/n), and since
z(95) is 1.96:

	0.05=1.96*sqrt(0.5*0.5/n)
Rearranging terms:
	n=(1.96**2)*(0.5*0.5)/0.05**2=384.16

Hence 385 programs would be a conservative estimate of the number of
programs required.
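
A minimal Python sketch of this sample-size formula follows. The function
name sample_size_for_proportion and its default of pi=0.5 are illustrative
assumptions. It uses the unrounded normal quantile, so its answer can
differ by a few units from a hand calculation with a rounded z.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, pi=0.5):
    """Solve margin = z * sqrt(pi*(1-pi)/n) for n, rounded up."""
    # Two-sided quantile, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * pi * (1 - pi) / margin ** 2)

# No prior estimate of pi, so fall back on the conservative value 0.5
n_conservative = sample_size_for_proportion(0.05, 0.95)
print(n_conservative)
```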

Problem #3:

Assume that for other companies like yours 75% of programs are Y2K compliant.
Use this fact to re-estimate the sample size from Problem #2.

Solution:

In this case it is reasonable to use 0.75 as an estimate for pi. The working
is the same as in #2 above, with z(95)=1.96:

	0.05=1.96*sqrt(0.75*0.25/n)
Rearranging terms:
	n=(1.96**2)*(0.75*0.25)/0.05**2=288.12

Hence 289 programs would be required.
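
To see how much a prior estimate of pi helps, the sketch below tabulates
the required sample size for several assumed values of pi at 95%
confidence and an accuracy of 0.05. The values 0.6 and 0.9 are included
only for illustration. The unrounded normal quantile is used, so results
can differ by a few units from a hand calculation with a rounded z.

```python
from math import ceil
from statistics import NormalDist

z95 = NormalDist().inv_cdf(0.975)    # two-sided 95% quantile, about 1.96
margin = 0.05

required = {}
for pi in (0.5, 0.6, 0.75, 0.9):
    # n = z^2 * pi * (1 - pi) / margin^2, rounded up to a whole program
    required[pi] = ceil(z95 ** 2 * pi * (1 - pi) / margin ** 2)
    print(f"pi = {pi:.2f}: n = {required[pi]}")
```

The required n is largest at pi=0.5 and shrinks as pi moves towards 0 or
1, which is why 0.5 is the safe choice when nothing is known about pi.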