Fisher’s Exact Test

This tutorial explains how to conduct Fisher’s Exact Test in R. Fisher’s Exact Test is used to test for differences in the frequencies of categorical variables between groups when the data do not meet the assumptions of the more common Chi Square test, specifically when more than one expected value is below 5. Here is a brief example of how to run this test in base R.

In the example, we will test whether the frequencies of dorsal fin rays differ significantly between three populations of the tetra species Pseudochalceus lineatus in western Ecuador. The counts are in a data file called 1_DF.txt. The data are formatted as follows:

8_9 10_11
5   14
3   16
1   7

The first line holds the column headers (variable names): 8_9 indicates that fish with 8 rays and 9 rays have been pooled into a single category, and 10_11 that fish with 10 rays and 11 rays have been pooled. The categories were pooled because there were very few fish with 8 or 11 rays. The next three rows are the data, one row per population, and give the number of fish in each cell (category). So, in the first population, there are 5 fish with either 8 or 9 dorsal fin rays and 14 fish with either 10 or 11 dorsal fin rays, and so on for populations 2 and 3.
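
If you do not have the data file to hand, the same counts can be typed directly into R. The chunk below is only a sketch: the object name counts and the row labels pop1, pop2 and pop3 are placeholders chosen for illustration and do not appear in the data file.

```r
# Counts copied from the table above: rows are populations,
# columns are the pooled dorsal fin ray categories.
# The labels pop1..pop3 are placeholder names, not taken from the file.
counts <- matrix(
  c(5, 14,
    3, 16,
    1,  7),
  nrow = 3, byrow = TRUE,
  dimnames = list(population = c("pop1", "pop2", "pop3"),
                  rays = c("8_9", "10_11"))
)

counts           # print the table to check it against the file
rowSums(counts)  # number of fish sampled in each population
```

A matrix built this way can be passed to the same test functions as a data frame read from a file.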

Let’s begin by setting up the work space. The example below is for my computer. You will have to modify this code to indicate where you want to work on your computer.

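# Set the working directory (change this path to suit your computer)
setwd("C:/1awinz/R_work/pseudochalceus")
# Read the counts; the first line of the file holds the column headers
data <- read.table("1_DF.txt", header = TRUE)
names(data)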
## [1] "X8_9"   "X10_11"

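Running this code chunk shows that there are two columns or variables, named X8_9 and X10_11. These are the headers 8_9 and 10_11 explained above; R prepends an X because a column name that begins with a digit is not a syntactically valid R name.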
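
The renaming of the headers can be checked without the file by passing the same lines to read.table() through its text argument. This is a small sketch; the object names txt, default_names and raw_names are made up for the example.

```r
# The same header and three rows of counts, supplied inline so no file is needed.
txt <- "8_9 10_11
5 14
3 16
1 7"

# Default behaviour: headers are converted to syntactically valid names.
default_names <- names(read.table(text = txt, header = TRUE))
default_names

# check.names = FALSE keeps the headers exactly as written.
raw_names <- names(read.table(text = txt, header = TRUE, check.names = FALSE))
raw_names
```

Either form works for the test, because only the counts are used and not the column names.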

Now let’s get the expected frequencies to see if cells with expected values <5 are present. The more common Chi Square test assumes that no more than one cell of the expected values table has a frequency of less than 5. We call chisq.test() here only to extract its table of expected values, not to use its result.

chisq.test(data)$expected
## Warning in chisq.test(data): Chi-squared approximation may be incorrect
##          X8_9    X10_11
## [1,] 3.717391 15.282609
## [2,] 3.717391 15.282609
## [3,] 1.565217  6.434783

We can see that three cells have expected values less than 5, so we are correct to use Fisher’s Exact Test.
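
The expected values can also be worked out by hand, which makes it clear where they come from: each one is the row total times the column total divided by the grand total. The sketch below assumes the counts are entered as a matrix named counts (a name chosen for this example) rather than read from the file.

```r
# Observed counts: rows are populations, columns are ray categories.
counts <- matrix(c(5, 14,
                   3, 16,
                   1,  7),
                 nrow = 3, byrow = TRUE)

# Expected count = (row total * column total) / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
expected

# Number of cells with an expected value below 5.
n_small <- sum(expected < 5)
n_small
```

For the first cell this is 19 * 9 / 46, which is about 3.72 and matches the value reported by chisq.test().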

Let’s run the test. The function fisher.test() comes with base R, so you do not need to install or load any packages. Run the test with the following code:

fisher.test(data)
## 
##  Fisher's Exact Test for Count Data
## 
## data:  data
## p-value = 0.7
## alternative hypothesis: two.sided

The test indicates that P=0.7, where P is the probability of obtaining a table at least as extreme as the one observed if the null hypothesis of no differences is true. Because P is well above the conventional threshold of 0.05, we do not reject the null hypothesis and conclude that there is not a significant difference in the frequencies of dorsal fin rays between the three populations.
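
If you want to use the P value in later code rather than read it off the printed output, it can be extracted from the object that fisher.test() returns. This is a minimal sketch: the counts are entered as a matrix for convenience, and the names counts, result, alpha and significant, as well as the 0.05 threshold, are assumptions made for the example.

```r
# Observed counts: rows are populations, columns are ray categories.
counts <- matrix(c(5, 14,
                   3, 16,
                   1,  7),
                 nrow = 3, byrow = TRUE)

# Store the test result instead of only printing it.
result <- fisher.test(counts)

# The P value is held in the p.value element of the result.
result$p.value

# Compare against a significance threshold (0.05 is assumed here).
alpha <- 0.05
significant <- result$p.value < alpha
significant
```

Storing the result this way is handy when the same test is repeated for several characters, such as other fin ray counts.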