SPSS - Quick Primer

While various SPSS help resources will be made available to you (e.g. Steve Jost's, Prof Sabatini's, etc.), this file is intended as a very quick primer on the basic things you'll need to do to get started using SPSS.

Accessing/copying files between the local machine and the remote machine: Recall that you will want to copy all of the files from the book's CD onto the remote computer. Review the steps discussed in lecture on how to do this. Very briefly, it involves changing the settings when you start Remote Desktop. Be sure that the 'Options' button is selected so that you can see all of the tabs. Choose: Local Resources tab >> under 'Local devices and resources' click on 'More' >> click on the plus sign next to 'Drives' >> select the drive(s) you want to be able to access from the remote machine. Note that these drives will show up only when you navigate up to the folder/directory level that shows all of the drives on your computer.

Opening files from your textbook in SPSS: Your textbook uses files with the extension 'POR'. However, SPSS defaults to looking for files with an extension of 'sav'.  What this means is that after clicking on File >> Open, you will need to change the 'Files of Type' option from 'sav' to 'por'.  For this problem, and any subsequent problem that asks you to draw a graph, be sure to paste your graph into your Word document.

Random Tip: If you get tired of always having to choose the 'POR' file type, you can try the following: Rename all of the files from their POR extension to SAV. The easiest way to do this is from the command line (for those of you who know how to use it).  As far as I can tell, this does not harm the data.
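If you would rather not use the command line, here is a minimal Python sketch of the same idea. Note that it copies each file instead of renaming it, so your original POR files stay untouched; that is my own cautious variation on the tip. The function name and the folder you pass in are made up for illustration, and whether SPSS opens the resulting files correctly is the tip's claim, not something this sketch checks.

```python
from pathlib import Path


def copy_por_to_sav(folder):
    """Copy every .por file in `folder` to a sibling file ending in .sav.

    Returns the list of .sav files that were created.
    """
    created = []
    for por_file in sorted(Path(folder).iterdir()):
        # Match the extension regardless of case (.por, .POR, ...).
        if por_file.is_file() and por_file.suffix.lower() == ".por":
            sav_file = por_file.with_suffix(".sav")
            if not sav_file.exists():  # never overwrite an existing file
                sav_file.write_bytes(por_file.read_bytes())
                created.append(sav_file)
    return created
```

You would call it with the folder that holds the textbook files, for example copy_por_to_sav("path/to/your/data/folder").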

 

PASTING CHARTS FROM SPSS INTO A DOCUMENT:

Sometimes you will find that SPSS does not properly allow you to copy/paste. For example, you may find that it only pastes half of the chart.  Here is a method that will give you a nice image of any chart you wish to paste:

Right click the chart  ---  choose EXPORT --- Under ‘DOCUMENT type’ choose ‘Graphics Only’ --- Under ‘Graphics Type’ choose JPG (probably your best bet) --- Choose a location and save.

(Remember that this file is saved on the remote machine unless you specify your local computer).

 

If you are using a Mac: Some people have had problems pasting from Macs – I have no idea why. However, one student pointed out the following fix:

“I have a MacBook Pro and what I do is select Command/Shift/4 and you'll get a crosshair symbol, you then press down on your touchpad and drag it around your graph. When you let go it will save the screenshot to your desktop. Then in Word choose Insert > Picture from file and TA DA!”  See also: https://support.apple.com/en-us/HT201361

 

 

Scatterplot:

1.    From the 'Data Editor' view, choose:  Graphs >> Legacy Dialogs >> Scatter/Dot

2.    Choose Simple Scatter, and click 'Define'

3.    Remember to put your explanatory variable on the x-axis and your response variable on the y-axis

4.    Click OK


To generate a line:

1.    In the 'Statistics Viewer' view, double click the graph

2.    This will bring up the 'Chart Editor' view.

3.    Look for the little icon of a scatterplot with a line through it. You'll know you're on the correct icon if, when you hover over it for a second, you see the label 'Add Fit Line to Total'

4.    In the 'Fit Line' tab, choose 'Linear' (since you are trying to see if there may be a linear relationship), and click Close.

5.    You can close the 'Chart Editor' view. You will now see a line in the 'Output' view.
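If you are curious what the linear fit line represents, it is the least-squares line through your points. Here is a minimal Python sketch of that calculation. The data and the names (fit_line, hours, score) are made up for illustration; this is not what SPSS runs internally, just the textbook formula.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data: explanatory variable first, response variable second.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

b0, b1 = fit_line(hours, score)
print(f"fitted line: y = {b0:.2f} + {b1:.2f} * x")  # fitted line: y = 47.70 + 4.10 * x
```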


It probably wouldn't hurt to go through this process 2-3 times to get comfortable since you'll be doing it so much.  I'd begin all the way at the beginning. In other words, re-open a data file and go through the whole thing again until you're comfortable opening files and creating a scatter plot.  Other graphs have their own sequence of commands, but are often similar to scatterplots.

Correlation ('r'):  Analyze >> Correlate >> Bivariate.  Choose your variables, and click on OK.
The Pearson Correlation is your value for ‘r’.
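To see where that number comes from, here is a small Python sketch of the Pearson correlation formula. The function name and the sample data are made up for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient r between two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Made-up data for illustration only.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]
print(f"r = {pearson_r(hours, score):.3f}")
```

A value close to +1 or -1 indicates a strong linear relationship, and a value close to 0 indicates a weak one.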

Drawing a histogram:  Graphs >> Legacy Dialogs >> Histogram >> select your field.

Plotting a curve over a histogram:  Double-click the histogram to open the ‘Chart Editor’.  Click the icon that shows a curve (hovering will display the label ‘Show Distribution Curve’).

Normal Quantile Plot:  Recall that this plot compares the values of a single quantitative variable with the values you would expect if the data were Normal, which gives you an idea of whether the data follow a Normal distribution. To plot in SPSS:  Analyze >> Descriptive Statistics >> P-P Plots. Simply choose your variable and click ‘OK’.  (Note: Sometimes instead of ‘P-P Plot’ you will see ‘Q-Q Plot’. They are not completely identical, but for now you may treat them as the same.)
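For those who want to see what goes into such a plot, here is a rough Python sketch of the quantile (Q-Q) version: each sorted data value is paired with the value a Normal distribution would put at the same rank. The function name is made up, and the plotting-position formula (Blom's) is an assumption on my part about a common default, so do not expect the numbers to match SPSS exactly.

```python
from statistics import NormalDist, mean, stdev


def normal_quantile_pairs(data):
    """Pair each sorted observation with its expected Normal value."""
    n = len(data)
    # Normal distribution with the same mean and SD as the data.
    fitted = NormalDist(mean(data), stdev(data))
    pairs = []
    for rank, value in enumerate(sorted(data), start=1):
        # Blom's plotting position (an assumed choice, see the note above).
        p = (rank - 0.375) / (n + 0.25)
        pairs.append((value, fitted.inv_cdf(p)))
    return pairs


# Made-up data for illustration only.
for observed, expected in normal_quantile_pairs([12, 15, 11, 19, 14, 13, 17]):
    print(f"observed {observed:>3}   expected if Normal {expected:6.2f}")
```

If the data are roughly Normal, plotting observed against expected gives points close to a straight line.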


Boxplots:  Graphs >> Legacy Dialogs >> Boxplot.  Choose ‘Summaries of SEPARATE variables’. Select the variable you are interested in, and move it to the dialog box labeled ‘Boxes Represent’.

Finding descriptive statistics such as mean, standard deviation etc: 

   Analyze >> Descriptive Statistics >> Descriptives (for things like mean, SD). 

You can select which stats you want by choosing ‘Options’.  You can get even more information such as Medians by:  Analyze >> Descriptive Statistics >> Explore.

Descriptives including quartile scores

   Analyze >> Descriptive Statistics >> Frequencies. 

Select the column.  Click the ‘Statistics’ button and choose all information you want to see (e.g. quartiles).
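As a cross-check on what these menus report, here is a short Python sketch that computes the same kinds of summaries. The function name and the data are made up. Be aware that software packages use slightly different rules for quartiles, so the quartile values here may differ a little from what SPSS shows; treat that as an assumption, not a guarantee of a match.

```python
from statistics import mean, median, quantiles, stdev


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _, q3 = quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),      # sample SD (divides by n - 1)
        "median": median(values),
        "q1": q1,
        "q3": q3,
    }


# Made-up data for illustration only.
for name, value in describe([12, 15, 11, 19, 14, 13, 17, 16]).items():
    print(f"{name:>6}: {value}")
```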

 

Listing z-scores for all datapoints in a series:

Begin by determining the descriptive statistics (as discussed earlier: Analyze >> Descriptive Statistics >> Descriptives).  Select the column for which you wish to find the z-scores.  Then check ‘Save standardized values as variables’ at the bottom. This will create an additional column in your SPSS table showing the z-score for every datapoint in the column you selected.
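The z-score calculation itself is simple: subtract the mean and divide by the standard deviation. Here is a minimal Python sketch; the function name and data are made up, and it uses the sample standard deviation (the n - 1 version), which I am assuming is the one you want here.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return the z-score of every value: (value - mean) / SD."""
    m = mean(values)
    s = stdev(values)  # sample SD (divides by n - 1)
    return [(v - m) / s for v in values]


# Made-up data for illustration only.
data = [12, 15, 11, 19, 14, 13, 17, 16]
for value, z in zip(data, z_scores(data)):
    print(f"{value:>3}   z = {z:6.3f}")
```

A positive z-score means the datapoint is above the mean, a negative one means it is below.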

 

 

Regression Analysis:

   Analyze >> Regression >> Linear

 

Choose your explanatory and response variables.  Note that SPSS still uses the labels ‘dependent’ and ‘independent’.  Recall that we are avoiding these labels in favor of explanatory (independent) and response (dependent). 

 

Note: You will typically want to include a residual plot as part of this process. Refer to the lecture notes for an explanation of why we would do so.  Therefore, before clicking on ‘OK’, first click on ‘Save’.  Then under ‘Residuals’, select ‘Standardized’ and click ‘Continue’.  Then click ‘OK’.  This step will add an additional column to your dataset. That column will be called something like 'Standardized Residual' or perhaps 'ZRE'. For more details, see the section called 'Residuals' below.

 

Interpreting the regression analysis tables:

You will see a few tables. The ‘Model Summary’ table gives you r and r² (labeled ‘R’ and ‘R Square’).   If you want to include the regression line on your scatterplot, use the technique for generating the line described above in the ‘Scatterplot’ example.

 

Look for the table labeled ‘Coefficients’.  In the coefficients table, the information for your regression line is under the column labeled ‘B’.  The B value labeled ‘Constant’ (183.08 in this example) is the y-intercept, i.e. B0. The B value in the row below it (-0.184 in this example) is the slope.

 

Coefficients

                   Unstandardized Coefficients    Standardized Coefficients
Model              B           Std. Error         Beta                         t         Sig.
1   (Constant)     183.080     53.550                                          3.419     .014
                   -.184       .366               -.201                        -.503     .633
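To make the table concrete, here is a small Python sketch that uses the two B values from this example to predict a response, and checks that the t column is simply B divided by its standard error. The function name and the x value of 100 are made up for illustration.

```python
# Values copied from the Coefficients table in this example.
INTERCEPT = 183.080   # B for (Constant), i.e. B0
SLOPE = -0.184        # B in the second row, i.e. the slope


def predict(x):
    """Predicted response from the regression line y = B0 + slope * x."""
    return INTERCEPT + SLOPE * x


# Hypothetical x value, chosen only to show the arithmetic.
print(f"predicted y at x = 100: {predict(100):.2f}")  # 164.68

# The t column is B divided by Std. Error.
t_constant = 183.080 / 53.550
t_slope = -0.184 / 0.366
print(round(t_constant, 3), round(t_slope, 3))  # 3.419 -0.503
```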

 

 

Residuals:

Recall from above that doing a residual plot is an important step in a linear regression analysis.  The reason is that if the residuals are not scattered randomly around the 0 line (i.e. if there is some pattern evident), then the relationship between your variables is probably not linear. In that case, doing a linear regression analysis will lead to bad results.

 

Recall that in the ‘Regression Analysis’ process above we created an additional column of residuals.  At this point, you can create a scatterplot, but instead of plotting the response variable on the y-axis, you would plot the residual:   Graphs >> Legacy Dialogs >> Scatter/Dot, choose ‘Simple Scatter’ and click ‘Define’ >> choose the residual column for the y-axis (where the response variable would normally go).

 

You will see a plot in which the y-axis has the value 0 in the middle. Your goal is to see if the data points are randomly spread above and below this line.
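If you want to see how a residual column like this can be computed, here is a rough Python sketch. It fits the least-squares line, takes each residual (observed minus predicted), and divides by the residual standard deviation. The function name and data are made up, and the exact scaling SPSS uses for its ‘Standardized’ residuals is an assumption on my part.

```python
from math import sqrt
from statistics import mean


def standardized_residuals(x, y):
    """Fit y = b0 + b1 * x by least squares and return residual / s."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    # Residual = observed response minus predicted response.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Divide by n - 2 because two coefficients were estimated.
    # (This fails if the points lie exactly on a line, since s would be 0.)
    s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return [e / s for e in residuals]


# Made-up data for illustration only.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]
for xi, z in zip(hours, standardized_residuals(hours, score)):
    print(f"x = {xi}   standardized residual = {z:6.3f}")
```

Positive values sit above the 0 line and negative values sit below it, which is exactly what you are checking for randomness in the plot.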

 

To make it easier, be sure to draw a horizontal line at the 0 value.  (You will notice that the residual plot does not give you the 0-axis line that we have seen in our examples in lecture.) To get this line, double-click the graph, and look for the little icon that shows a straight horizontal line. (Hovering will display: “Add a reference line to the y axis”.) When the ‘Properties’ window appears, you can type 0 under ‘Position’, assuming that is where you want the line. Click ‘Close’.


Stemplot: 

   Analyze >> Descriptive Statistics >> Explore

Choose the variable you want to plot. Click on 'Plots' >> check Stem and Leaf. Click Continue, click OK.
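To remind yourself how a stemplot is built, here is a simple Python sketch for whole numbers, using the tens digit as the stem and the units digit as the leaf. The function name and data are made up, and SPSS's own output is laid out differently (it adds a frequency column, for example), so this only illustrates the idea.

```python
from collections import defaultdict


def stemplot(values):
    """Return stem-and-leaf lines for non-negative whole numbers."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 34 -> stem 3, leaf 4
        leaves[stem].append(str(leaf))
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem:>2} | {''.join(leaves.get(stem, []))}")
    return lines


# Made-up data for illustration only.
for line in stemplot([12, 15, 21, 34, 38, 30]):
    print(line)
```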

Pivot Tables

   Analyze >> Descriptive Statistics >> Frequencies

Choose the two columns you want to create a pivot table out of.  Be sure the ‘Display Frequencies Table’ checkbox at the bottom is selected.

 

Crosstab Tables:

    Analyze >> Descriptive Statistics >> Crosstabs

Choose the two columns you want to create a crosstab table out of.
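A crosstab simply counts how many rows of your data fall into each combination of the two columns. Here is a minimal Python sketch of that counting; the function name, the column names, and the data are all made up for illustration.

```python
from collections import Counter


def crosstab(rows, cols):
    """Count how often each (row value, column value) pair occurs."""
    counts = Counter(zip(rows, cols))
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    # A Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}


# Made-up data: one entry per student.
year = ["first", "second", "first", "first", "second"]
commute = ["bus", "car", "car", "bus", "bus"]

for row_value, row_counts in crosstab(year, commute).items():
    print(row_value, row_counts)
```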