Statistics Used with Content Analysis
I Chi-square Analysis
Goal: To determine whether the results observed are due to chance or not.
We are trying to determine whether the results we obtained are simply due to chance or whether there is a pattern to them.
Chi-square tests won't tell us why the patterns in a content analysis occurred, but they will tell us whether the results we got are likely to be due to chance.
In short, chi-square determines whether our observations are falling into categories due to chance or not.
In other words, is there something unusual about how our observations are falling into categories?
1. Chi-square tests should be used with nominal or ordinal data.
We are trying to determine whether observations fall into categories in an unusual pattern or merely by chance.
2. There are two kinds of Chi-square tests.
a. One sample chi-square
Used when only one variable is involved (with two or more attributes, or levels)
b. Multiple sample chi-square
Used when there are two variables being compared (each with two or more attributes or levels)
B How tests work:
Overall:
Simply compare what the observations would be like if they were truly random (theoretical frequencies) with the observations that were actually obtained (observed frequencies).
Calculate the difference between the theoretical and observed frequencies.
If we get an unusual difference (one that would happen by chance fewer than 5 times out of a hundred), then we say that the observations obtained did not occur by chance, but represent some sort of pattern.
Let's work through an example of how we actually do this.
1 One sample
a. Write out null and research hypothesis.
Null Hypothesis:
There should be an equal number of observations across the various levels or categories of the variable in question.
Research Hypothesis:
There won't be an equal number of observations across the categories.
b. Calculate what should happen if there is no pattern to the observations.
Calculate a theoretical value: what the observations should be if there were no pattern to them, that is, what the observations should be if the null hypothesis is true.
Theoretical value (T) = Total number of observations divided by the number of categories
c. Compare this theoretical value with the actual values (observations or O)
This is done by calculating a chi-square.
χ² (chi-square) = sum of (O - T)² / T
O = observed frequencies
T = theoretical frequencies
What this formula does is calculate the distance between the theoretical values and the actual observations. The larger the chi-square number, the greater the discrepancy between what should have happened if everything was equal and what actually happened.
In other words, the more my observed frequencies differ from the theoretical frequencies, the greater my chi-square value.
In short, the greater the chi-square number, the further my actual observations are from what would happen if the observations were simply put into categories at random.
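Steps b and c can be sketched in a few lines of Python. This is only an illustration: the category names and counts below are hypothetical, made up for the example, and do not come from a real content analysis.

```python
# Hypothetical counts: news stories coded into three tone categories.
observed = {"positive": 30, "neutral": 50, "negative": 40}

total = sum(observed.values())   # total number of observations (120)
k = len(observed)                # number of categories (3)
theoretical = total / k          # T = total / number of categories

# Chi-square = sum over categories of (O - T)^2 / T
chi_square = sum((o - theoretical) ** 2 / theoretical for o in observed.values())

print(f"T = {theoretical}, chi-square = {chi_square}")
# prints: T = 40.0, chi-square = 5.0
```

Here two categories are each 10 observations away from T = 40, and each contributes 100/40 = 2.5 to the total.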
d. Determine if the chi-square value obtained is unusual (happens fewer than 5 times out of a hundred).
Look at the distribution of chi-square critical values and see if the value obtained is unusual.
To do this, we need to look at the correct distribution, so we need to calculate the degrees of freedom.
df = (k-1) (where k is the number of categories)
e. If the observed chi-square value is greater than the critical value (listed in the table), then reject the null hypothesis.
Assume that there is a pattern to the data.
Things did not fall into groups in equal numbers. Observations are not due to chance.
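The whole one-sample procedure (steps b through e) can be sketched as follows. The function name `one_sample_chi_square` and the counts are hypothetical, chosen only for illustration. The critical values are the usual 0.05-level entries from a standard chi-square table, and this sketch assumes only df = 1 through 5 are needed.

```python
# Chi-square critical values at the 0.05 level, from a standard table.
# Only df = 1..5 are included in this sketch.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def one_sample_chi_square(observed):
    """Return (chi_square, df, reject_null) for a list of category counts."""
    k = len(observed)
    theoretical = sum(observed) / k                      # step b: T
    chi_square = sum((o - theoretical) ** 2 / theoretical
                     for o in observed)                  # step c
    df = k - 1                                           # step d
    reject_null = chi_square > CRITICAL_05[df]           # step e
    return chi_square, df, reject_null

# Hypothetical counts: one fairly even set and one clearly uneven set.
even_result = one_sample_chi_square([30, 50, 40])    # chi-square 5.0, df 2
uneven_result = one_sample_chi_square([20, 60, 40])  # chi-square 20.0, df 2

print(even_result)
print(uneven_result)
```

With df = 2 the critical value is 5.991, so the first set (chi-square of 5.0) does not lead us to reject the null hypothesis, while the second set (chi-square of 20.0) does.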
2 Multiple Sample
Sometimes we look for patterns among two variables simultaneously.
Multiple sample is different in that it addresses whether or not the observations are proportional across the variables being examined.
a. Steps and formula are the same as a single sample chi-square.
b. The only difference is how the theoretical value and degrees of freedom are calculated.
Must calculate a separate theoretical frequency for each cell.
T = (Row sum × Column sum) / Grand sum
df = (R-1) × (C-1)
where:
R = number of rows (categories of the row variable)
C = number of columns (categories of the column variable)
Everything else is the same.
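A minimal sketch of the multiple sample calculation is shown below. The 2 x 3 table is hypothetical (two newspapers by three story tones), invented only to show how the theoretical frequency for each cell and the degrees of freedom are obtained.

```python
# Hypothetical 2 x 3 table: rows are two newspapers, columns are three tones.
observed = [
    [20, 30, 10],
    [10, 30, 20],
]

row_sums = [sum(row) for row in observed]        # one total per row
col_sums = [sum(col) for col in zip(*observed)]  # one total per column
grand_sum = sum(row_sums)

# T for each cell = (row sum x column sum) / grand sum
theoretical = [[r * c / grand_sum for c in col_sums] for r in row_sums]

# Same formula as the one-sample test, summed over every cell.
chi_square = sum(
    (o - t) ** 2 / t
    for obs_row, theo_row in zip(observed, theoretical)
    for o, t in zip(obs_row, theo_row)
)

# df = (number of rows - 1) x (number of columns - 1)
df = (len(row_sums) - 1) * (len(col_sums) - 1)

print(theoretical, chi_square, df)
```

For this table every cell in the first and third columns has T = 15 and every cell in the middle column has T = 30. The chi-square value is 100/15 (about 6.67) with df = 2, which exceeds the 0.05 critical value of 5.991, so the null hypothesis would be rejected.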
Summary
Chi-square tests simply compare what should have happened if there were no pattern to the observations (equal or proportional frequencies) with what was actually observed.
If the difference between the theoretical and actual observations is unusual then we reject the idea that the observations occurred at random. There was a pattern to the data.