Outline of Topics: 223 Midterm

Your best guides are the PowerPoint lectures and the quizzes. However, because it has been requested, I have created this outline of the topics that have been covered.

For this exam, I will provide all formulas along with a z-table.

You should bring a simple calculator. Be sure it has a square-root key. You may NOT use a calculator that is part of a data device (e.g., an iPhone or other organizer). Graphing calculators are allowed.

·         Quantitative vs Categorical variables

·         Charting data: Which charts do you use for categorical (aka nominal) data? Which for quantitative data?

·         Charting data: When to use bar vs pie charts.

·         Deceptions or misleading information when using pie charts

·         Use of histograms.

o   Interpretation

o   Limitations/Misinterpretations of histograms

o   Difference between a histogram and a bar chart. For example, a histogram should not have spaces between the bars unless a particular bin contains no data.

o   Impact of skewed data on a histogram

·         Outliers: Identification.

·         Mean vs Median

·         The term “resistant” (as it applies to statistics)
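To make "resistant" concrete, here is a small Python sketch. The data values are made up for illustration and are not taken from the lectures. Replacing one value with an extreme one moves the mean a great deal but leaves the median where it was.

```python
from statistics import mean, median

# Hypothetical data, invented for illustration only.
original = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 500]  # same data, largest value replaced by an extreme one

mean_before, median_before = mean(original), median(original)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean is pulled toward the extreme value; the median does not move.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```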

·         Quartiles:

o   IQR

o   5-Number summary

o   Boxplot

o   1.5 × IQR rule for outliers
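For practice, the quartile topics above can be sketched in Python. This is only a sketch under two assumptions: the data are invented, and the quartiles are found as the medians of the lower and upper halves (leaving out the overall median when n is odd). Other textbooks and software use slightly different quartile conventions, so check it against the method from the lectures.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                               # values below the median position
    upper = xs[half + 1:] if n % 2 else xs[half:]   # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def suspected_outliers(data):
    """Flag values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical data, invented for illustration only.
values = [2, 4, 5, 7, 8, 9, 11, 12, 30]
summary = five_number_summary(values)
flagged = suspected_outliers(values)
print(summary)   # the five numbers a boxplot is drawn from
print(flagged)
```

Here Q1 = 4.5 and Q3 = 11.5, so IQR = 7 and the fences are at -6 and 22; only 30 falls outside them.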

·         Standard Deviation

o   What the concept means

o   “Properties” of s
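As a check on the idea behind s, here is a minimal Python sketch of the sample standard deviation (dividing by n - 1). The data are invented, and the properties it illustrates are the standard ones; the list given in the lectures may be worded differently.

```python
from math import sqrt
from statistics import stdev

def sample_sd(data):
    """Sample standard deviation: square root of the sum of squared deviations over n - 1."""
    n = len(data)
    xbar = sum(data) / n
    squared_devs = [(x - xbar) ** 2 for x in data]
    return sqrt(sum(squared_devs) / (n - 1))

# Hypothetical data, invented for illustration only.
data = [1, 2, 3, 4, 5]
s = sample_sd(data)
s_shifted = sample_sd([x + 10 for x in data])   # adding a constant leaves the spread unchanged
s_no_spread = sample_sd([7, 7, 7])              # no spread at all gives s = 0
s_with_outlier = sample_sd([1, 2, 3, 4, 500])   # s is not resistant to outliers

print(s, s_shifted, s_no_spread, s_with_outlier)
```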

·         Density curves

o   Concept

o   Interpretation

o   How the z-score lets you compare “apples and oranges” (e.g. SAT scores and ACT scores)

o   Calculations involving areas/percentages/z-scores etc

o   Normal distribution: concept & interpretation, properties
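The "apples and oranges" comparison can be sketched as follows. The means and standard deviations used here are hypothetical round numbers chosen for illustration, not the real SAT or ACT figures. On the exam you would look up the area in the z-table; `NormalDist().cdf` from Python's standard library plays that role here.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical distributions, invented for illustration only.
sat_z = z_score(1300, mu=1000, sigma=200)   # an SAT score of 1300
act_z = z_score(29, mu=20, sigma=5)         # an ACT score of 29

# Area under the standard Normal curve to the left of a z-score,
# which is the value a z-table reports.
area_below_sat = NormalDist().cdf(sat_z)
area_above_sat = 1 - area_below_sat

print(sat_z, act_z)              # the larger z-score is the better relative standing
print(round(area_below_sat, 4))
```

Under these assumed numbers the ACT score has the higher z-score (1.8 versus 1.5), so it is the stronger result relative to its own distribution.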

·         Normal quantile plot

·         Scatterplots:

o   When are they used?

o   Explanatory vs response variables

o   Interpretation (e.g., what does a negative slope tell us? Strength, outliers)

o   Misinterpretations

o   Categorical data as the explanatory variable

·         Correlation coefficient including properties of r
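Here is a minimal Python sketch of r, assuming the usual definition as the sum of products of z-scores divided by n - 1. The data are invented. It also checks two standard properties: r does not depend on which variable is called x, and r does not change when the units of measurement change.

```python
from math import sqrt

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def correlation(x, y):
    """r = (1 / (n - 1)) * sum of (z-score of x) * (z-score of y)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sx, sy = sample_sd(x), sample_sd(y)
    return sum(((a - xbar) / sx) * ((b - ybar) / sy) for a, b in zip(x, y)) / (n - 1)

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
r_swapped = correlation(y, x)                        # swapping x and y
r_rescaled = correlation([2.54 * a for a in x], y)   # changing units of x (e.g. inches to cm)

print(round(r, 4), round(r_swapped, 4), round(r_rescaled, 4))
```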

·         Regression

o   What is a regression line? Why is it helpful?

o   What is the name of the method we have used to determine the best regression line?

o   Understand the y = b0 + b1*x formula

o   Know how to find and calculate b0 and b1

o   Correlation vs regression

o   Extrapolation and its problems
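To practice finding b0 and b1, here is a sketch that assumes the usual least-squares formulas, b1 = r * sy / sx and b0 = ybar - b1 * xbar; confirm these against the formula sheet provided for the exam. The data and the helper names are invented for illustration.

```python
from math import sqrt

def least_squares_line(x, y):
    """Return (b0, b1) for the line y = b0 + b1*x, using b1 = r*sy/sx and b0 = ybar - b1*xbar."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sx = sqrt(sum((a - xbar) ** 2 for a in x) / (n - 1))
    sy = sqrt(sum((b - ybar) ** 2 for b in y) / (n - 1))
    r = sum(((a - xbar) / sx) * ((b - ybar) / sy) for a, b in zip(x, y)) / (n - 1)
    b1 = r * sy / sx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(x, y)
prediction_inside = b0 + b1 * 3     # x = 3 is within the observed range of x
prediction_outside = b0 + b1 * 50   # x = 50 is far outside it: this is extrapolation

print(round(b0, 2), round(b1, 2))
print(round(prediction_inside, 2), round(prediction_outside, 2))
```

The formula will return a number for any x, including x = 50, but nothing in the data supports the line that far beyond the observed range.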

·         Coefficient of determination (R²) – what does it mean?

·         Residuals and residual plots
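The following sketch ties residuals to the coefficient of determination. It assumes a least-squares line and invented data. A residual is the observed y minus the predicted y, and R² is computed as 1 minus (sum of squared residuals) / (total sum of squares of y), which for a least-squares line with one explanatory variable equals r squared.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept (equivalent to b1 = r*sy/sx, b0 = ybar - b1*xbar).
sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
sxx = sum((a - xbar) ** 2 for a in x)
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Residual = observed y - predicted y.
predicted = [b0 + b1 * a for a in x]
residuals = [b - p for b, p in zip(y, predicted)]

sse = sum(e ** 2 for e in residuals)      # variation left unexplained by the line
sst = sum((b - ybar) ** 2 for b in y)     # total variation in y
r_squared = 1 - sse / sst                 # fraction of the variation in y explained by the line

print([round(e, 2) for e in residuals])   # these are the heights in a residual plot
print(round(r_squared, 2))
```

With these data the residuals sum to zero (up to rounding), as they always do for a least-squares line, and R² is 0.6: the line accounts for 60% of the variation in y.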

·         Effect of outliers and influential points

·         Why is it important to always plot data? (see example in regression lecture)

·         Lurking/Confounding variables

·         Difference between association and causation

·         Anecdotal data

·         Difference between population and sample

·         Confounding variables

·         “Controls” in experimentation

·         Placebos

·         Biases: review and understand the examples. Review ways to avoid bias

·         Double-blind experimentation

·         Randomization (not how to do it, but why it is important)

·         Stratification