**IT-223 - Assignment #1**

All questions in this assignment should be saved into a Microsoft Word document or any ‘doc’ or ‘RTF’ compatible file. PDF is also fine. This file will then be submitted to COL.

This assignment is a combination of theoretical questions along with a few 'numbers' questions. All answers, including graphs, should be in a standard word processing document (Microsoft Word, Google Document). You will submit this file as an attachment to COL. The values in brackets are the maximum points for each question.

**Graphs**: You will be asked to draw a few graphs for this assignment. In order to submit, you will have to scan those graphs into your Word document. If you don't have access to a scanner, you may need to run out to Kinkos or similar to get them scanned. However, I expect that this will be the only time in the course where you will need to scan anything. If you absolutely cannot get to a scanner (e.g. a DL student who lives in the middle of nowhere), let me know and we can discuss other options. A clever technique used by one student was simply to photograph the graphs using a digital camera or phone and paste the photos from there into the document.

**Question #1 (10):** American Airlines flight 91 from London to Chicago O'Hare is **scheduled to arrive at 7:50 PM**. Not surprisingly, some flights arrive several (or many!) minutes early, and some arrive late. The following arrival times were recorded over a 6-day sequence (all times are PM): 8:05, 7:49, 8:43, 7:50, 11:47, 7:31.

Give the 5-number summary and draw a box plot (use the modified box plot if there are any outliers).

On average (i.e. using the mean), how many minutes early or late does this flight tend to arrive? Is the mean an ideal statistic for determining the center of this distribution? (Hint: Is there an outlier? How would you decide?) Show your calculations.
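If you would like to double-check your hand calculations, the following is a minimal Python sketch, not a required tool for this course. It rests on several assumptions: quartiles are taken as the medians of the lower and upper halves of the sorted data (other textbooks and software use slightly different conventions), an outlier is any value more than 1.5 × IQR beyond a quartile, and all clock times are same-day PM times written as `H:MM` with no 12:xx entries. The function names and the sample numbers are made up for illustration and are not the assignment data.

```python
from statistics import median


def minutes_from_schedule(clock, scheduled):
    """Signed minutes between two same-day 'H:MM' PM times (negative = early)."""
    def to_minutes(t):
        hours, minutes = t.split(":")
        return int(hours) * 60 + int(minutes)
    return to_minutes(clock) - to_minutes(scheduled)


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle value when the number of observations is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]


def outliers(values):
    """Values beyond the 1.5 x IQR fences (the rule behind a modified box plot)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Illustrative, made-up delays in minutes (NOT the flight data above).
sample = [-5, 0, 3, 4, 10, 90]
print(minutes_from_schedule("6:10", "5:45"))  # 25
print(five_number_summary(sample))            # (-5, 0, 3.5, 10, 90)
print(outliers(sample))                       # [90]
```

In a modified box plot, the whiskers stop at the most extreme values that are not flagged as outliers, and each flagged value is drawn as a separate point.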

**Question #2 (9):** The following list gives the survival times in days of several guinea pigs after they were injected with tubercle bacilli in a medical experiment.

**41, 99, 103, 103, 105, 107, 111, 113, 114, 117, 119, 598**

· (3) Draw a histogram (pick what you think is an ideal bin size).

· (3) Then describe the distribution of survival times. Are there any outliers?

· (3) Summarize the distribution by giving the five-number summary and by drawing a modified box plot.
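When experimenting with bin sizes, it can help to tally how many observations fall into each bin before drawing anything. The sketch below is one way to do that in Python, assuming equal-width bins of the form [left, left + width) that start at a chosen value. The function name `bin_counts` and the sample values are hypothetical and are not the guinea pig data.

```python
def bin_counts(values, bin_width, start=0):
    """Count observations per equal-width bin [left, left + bin_width).

    Returns a dict mapping each bin's left edge to its count.
    Bins that contain no observations are simply not listed.
    """
    counts = {}
    for v in values:
        index = (v - start) // bin_width      # which bin the value falls into
        left = start + index * bin_width      # left edge of that bin
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


# Illustrative, made-up values with a bin width of 10.
sample = [3, 7, 12, 18, 19, 41]
for left, count in bin_counts(sample, 10).items():
    print(f"{left:>3} to {left + 10:<3} {'#' * count}")
```

Trying two or three widths on the same data shows how a bin that is too wide hides the shape of the distribution, while a bin that is too narrow leaves mostly bins with zero or one observation.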

**BONUS VERSION - Worth up to 5 additional points: Use this dataset instead:**

**Question #3 (6):** This is not a stats question… Read the article at the top of the class web page called ‘Curve of Forgetting’. Summarize the article. Your summary does not have to be long, but it does need to demonstrate that you read and understood the article.

**Question #4 (8):** The following dataset comes from a series of student scores on a standardized exam: 687, 692, 681, 598, 789, 763, 990, 490. Calculate the mean and median.
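As a reminder of the arithmetic, the mean is the sum of the observations divided by their count, and the median is the middle value of the sorted data (or the average of the two middle values when the count is even). The Python sketch below spells this out. The helper names and the sample numbers are made up for illustration and are not the exam scores.

```python
def mean(values):
    """Sum of the observations divided by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data; average the two middle values if n is even."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2


# Illustrative, made-up values: one extreme value moves the mean, not the median.
print(mean([2, 4, 6, 8]), median([2, 4, 6, 8]))    # 5.0 5.0
print(mean([2, 4, 6, 80]), median([2, 4, 6, 80]))  # 23.0 5.0
```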

**Question #5 (7):** This is not a statistics question, and is meant to be some easy points. Next lecture, we will begin using a very powerful (and expensive!) statistical software package called SPSS. DePaul has a special license that allows us to *remotely* use SPSS that has been installed on DePaul machines. To do this, however, you need to set up remote desktop. This in and of itself is not very difficult, but may take a little bit of playing around. All you have to do for this question is go through the steps and demonstrate that you have successfully connected to SPSS. Begin by going to the ‘Resources’ page, and go through the steps needed to start SPSS using remote desktop. Once SPSS has been started, paste in a screenshot to show that you successfully got it started. This will give you full points for the question. One way to get the screenshot into your document is to press the ‘PrintScreen’ button on your keyboard, then open your Word document and press control-V.

**Submit your assignment to COL. As will always be the case, it is due 10 minutes before class time.**