Operationalization -- Levels of Measurement
Now, we are going to discuss how we actually go about measuring things when doing quantitative research.
This is a very important topic: It will come up time and time again.
I Levels of Measure
Remember that when doing quantitative research we are trying to measure (that is, assign numerical values to) the variables that we are studying.
For example, suppose we were studying conflict in close relationships. One of the questions we might ask is how often people actually fight with each other. In other words, we would have to find a way to measure how often people argue.
Or suppose we wanted to find out what types of friends fight the most often. We would then have to find a way of measuring different types of friends, and then see who fights the most. Maybe it is best friends who fight the most, or maybe it is acquaintances.
So, as you can see, we need to be able to measure a variety of different types of things when doing quantitative research, and there are basically four different types of measures that we can use when doing quantitative research.
A Nominal Measures
Measuring a variable by classifying each observation into a distinct category.
simply placing objects, people, or events into different categories.
nominal = "naming"
Now I know that definition may not make too much sense, but let's go through some examples.
Examples:
Variable | Attributes |
Sex | female, male |
Type of Friendship | best friends, close friends, acquaintances |
Religious Affiliation | Catholic, Muslim, Protestant, Buddhist, etc. |
It is simply placing people, objects, or events into different categories.
Sometimes called categorical data.
There are some special properties of nominal measures:
1. exhaustive (you must have enough categories so that everyone or everything being measured fits somewhere).
2. mutual exclusivity (things being measured should only fit into one category)
You should not be able to place the same item into two different categories.
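The two properties above can be checked mechanically once observations have been coded. Here is a minimal sketch, assuming each observation is recorded as the set of categories a coder assigned to it. The function name and the sample data are made up for illustration.

```python
# Hypothetical coding scheme for the "Type of Friendship" variable.
CATEGORIES = {"best friends", "close friends", "acquaintances"}

def check_nominal_coding(codes_per_observation, categories):
    """Return (exhaustive, mutually_exclusive) for a list of coded observations.

    codes_per_observation: one set of assigned categories per observation.
    """
    # Exhaustive: every observation fits into at least one known category.
    exhaustive = all(len(codes & categories) >= 1 for codes in codes_per_observation)
    # Mutually exclusive: no observation is placed in more than one category.
    mutually_exclusive = all(len(codes) <= 1 for codes in codes_per_observation)
    return exhaustive, mutually_exclusive

good = [{"best friends"}, {"acquaintances"}, {"close friends"}]
bad = [{"best friends", "close friends"}, set()]  # double-coded, then uncodable

print(check_nominal_coding(good, CATEGORIES))  # (True, True)
print(check_nominal_coding(bad, CATEGORIES))   # (False, False)
```

If either value comes back False, the category scheme (not the data) usually needs fixing, for example by adding an "other" category.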
Nominal scales are used all of the time in communication research, and in many different ways.
Let's go over a good example:
Signorielli, McLeod, Healy
Commercials on MTV
Took nominal measures of each person in the ad.
Measured them by Sex (male or female)
And other measures
Variable | Category | Sex: Males | Sex: Females |
Body Type | Out of Shape | 33 | 22 |
Body Type | Average | 191 | 79 |
Body Type | Very Fit | 37 | 127 |
Attractiveness | Unattractive | 13 | 19 |
Attractiveness | Neither | 164 | 36 |
Attractiveness | Attractive | 96 | 53 |
Attractiveness | Very attractive | 6 | 130 |
Skimpy/Sexy Clothing | Neutral | 257 | 110 |
Skimpy/Sexy Clothing | Somewhat sexy | 18 | 58 |
Skimpy/Sexy Clothing | Very sexy | 0 | 70 |
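Because nominal data are just counts in categories, about the only arithmetic we do with them is counting and converting counts to percentages. As a small illustration, the sketch below takes the Body Type counts from the table above and computes the percentage of each category within each sex. The function name is made up for this example.

```python
# Counts copied from the Body Type rows of the table above.
body_type = {
    "Out of shape": {"Males": 33, "Females": 22},
    "Average": {"Males": 191, "Females": 79},
    "Very fit": {"Males": 37, "Females": 127},
}

def column_percentages(table, column):
    """Percentage of each category within one column (e.g. within Males)."""
    total = sum(row[column] for row in table.values())
    return {category: round(100 * row[column] / total, 1)
            for category, row in table.items()}

male_pct = column_percentages(body_type, "Males")
female_pct = column_percentages(body_type, "Females")

print(male_pct)    # {'Out of shape': 12.6, 'Average': 73.2, 'Very fit': 14.2}
print(female_pct)  # {'Out of shape': 9.6, 'Average': 34.6, 'Very fit': 55.7}
```

Notice that we never average the categories themselves. There is no such thing as a "mean body type" with nominal data.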
B Ordinal Measures
1. Same as nominal measures.
2. But there is a rank order to categories.
Ordinal measures are somewhat similar to nominal measures in that they involve placing objects, people, or events into different categories. However, ordinal measures are unique in that categories are placed in a special order or ranking system.
In other words, the categories are ordered in a particular manner (there is a "more than" or "less than" relationship among the categories).
However, ordinal measures do not specify how much distance is between the categories.
Think of an ordinal scale sort of like the results of a horse race. You know the order in which the horses finished. That is, you know the 1st place winner beat the 2nd place winner, but you don't know by how much. It could have been by a couple of inches or by half a lap.
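To make the horse race idea concrete, here is a small sketch with made-up horses and finish times. Both races produce exactly the same finishing order, even though one was decided by fractions of a second and the other was a blowout. The ordinal measure (the order) throws that distance information away.

```python
# Hypothetical finish times in seconds for two different races.
race_a = {"Comet": 120.0, "Blaze": 120.1, "Dusty": 120.2}  # very close race
race_b = {"Comet": 120.0, "Blaze": 135.0, "Dusty": 150.0}  # runaway winner

def finishing_order(times):
    """Ordinal measure: horses listed from fastest to slowest."""
    return sorted(times, key=times.get)

order_a = finishing_order(race_a)
order_b = finishing_order(race_b)

print(order_a)             # ['Comet', 'Blaze', 'Dusty']
print(order_b)             # ['Comet', 'Blaze', 'Dusty']
print(order_a == order_b)  # True: same ranks, very different gaps
```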
For example:
Formal Education | 1. high school diploma, 2. college degree, 3. graduate degree |
College Basketball Rankings
or
Responses to the following Questions:
Rank the following TV shows according to how much you like them:
___ Melrose Place
___ Seinfeld
___ Friends
Compared with the romantic relationships that your friends have, would you say that your romantic relationship is:
( ) Excellent
( ) Good
( ) Fair
( ) Poor
There are some special properties of ordinal measures:
1. exhaustive
2. mutual exclusivity
3. categories are rank ordered* (the distance between categories is not known and is not necessarily equal)
C Interval Measures
1. Same as Ordinal measures.
2. Distance between categories is equal.
Interval measures have equal and known distances between ranked items with an arbitrary zero point.
Equal and known distance between ranked items. So, like ordinal measures, the items are placed into an order of greater or less than. However, in addition to this, the distance between each category or item is the same. That is, the distance between the first and second category is the same as the distance between the second and third category, and so on.
For example:
Temperature: | Celsius and Fahrenheit scales. |
Intelligence | IQ scores |
Some other examples:
How often do you fight with your girl/boyfriend?
Never | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Always |
This is a Likert-type scale.
Interval scales are unique in that zero on the scale does not really mean that nothing exists at that point. In other words, zero is arbitrary, in that it does not really refer to the fact that the thing you are measuring does not exist. For example, zero temperature on the Celsius and Fahrenheit scales does not mean that temperature does not exist at zero.
The same can be said about intelligence. A zero IQ does not mean that a person has absolutely no intelligence.
Think about SAT scores. They are interval in nature.
There are some special properties of interval measures:
1. exhaustive
2. mutual exclusivity
3. categories are rank ordered
4. equal and known distance between categories*
The last measure is a ratio measure.
D Ratio Measures
1. Same as interval
2. But, now zero actually means that nothing exists.
Now a ratio measure is simply everything an interval measure is, except zero actually refers to zero.
That is, zero is not arbitrary, but it actually means that what you are measuring does not exist. Because zero actually refers to zero, when numbers are doubled that actually means you have twice as much as you did before. For example, if you make twice as much money as someone else, then you truly have twice as much money.
However, with an interval scale this is not the case (because zero is arbitrary). For example, if the weather goes from forty degrees to eighty degrees it does not mean that it is twice as hot.
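We can check the temperature example with a little arithmetic. The sketch below assumes the forty and eighty degrees are in Fahrenheit (the notes do not say which scale) and converts both to Kelvin, which is a ratio scale with a true zero. The ratio of the raw Fahrenheit numbers is 2, but the ratio on the Kelvin scale is only about 1.08.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert a Fahrenheit temperature to Kelvin."""
    return (deg_f - 32) * 5 / 9 + 273.15

# Assumption: the forty and eighty degrees in the example are Fahrenheit.
low_f, high_f = 40, 80
low_k = fahrenheit_to_kelvin(low_f)
high_k = fahrenheit_to_kelvin(high_f)

print(high_f / low_f)            # 2.0  (ratio of interval-scale numbers: misleading)
print(round(high_k / low_k, 2))  # 1.08 (ratio on a scale with a true zero)
```

So eighty degrees is only about 8 percent "more temperature" than forty degrees, not 100 percent more.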
Examples:
Temperature: | Kelvin scale |
Age | in years |
Weight | in pounds |
Income | in dollars |
There are some special properties of ratio measures:
1. exhaustive
2. mutual exclusivity
3. categories are rank ordered
4. equal and known distances between categories
5. non arbitrary zero* (zero really means that measured item does not exist.)
One of the best ways to think about the measures is to think about the information provided by each measure:
II Differences among the measures?
A. Provide Different types of information
Information Provided | Nominal | Ordinal | Interval | Ratio |
Exhaustiveness & mutual exclusivity | Yes | Yes | Yes | Yes |
Rank Order | No | Yes | Yes | Yes |
Equal Intervals | No | No | Yes | Yes |
Nonarbitrary Zero | No | No | No | Yes |
Why do you think these differences matter?
B. Differences determine types of statistics that we can perform
We are limited to the statistics that we can calculate based upon how we have measured a variable.
We can only calculate a limited range of statistics when we measure variables with nominal or ordinal level measures.
However, we can perform a wide range of statistics if we measure variables using interval or ratio level measures. The reason for this is that when we measure variables using scales that have equally distant units between the measures, then we can perform all sorts of mathematical operations on the data.
For instance, if we use interval or ratio level data, we can add and subtract because we are working with units that are equal to each other, and with ratio level data we can also meaningfully multiply and divide. We can't do the same when the units between categories are not equal lengths.
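Here is a rough sketch of what this limitation looks like in practice, using Python's standard `statistics` module and made-up data. With nominal data we can only count (the mode), with ordinal data we can also use order (the median), and with interval or ratio data we can use the mean. The variable names and values are hypothetical.

```python
import statistics

# Hypothetical data for three variables measured at different levels.
religion = ["Catholic", "Muslim", "Catholic", "Buddhist", "Catholic"]  # nominal
education = [1, 3, 2, 2, 1, 2, 3]  # ordinal codes: 1=high school, 2=college, 3=graduate
fight_scale = [2, 5, 4, 7, 3]      # 1-7 scale, treated here as interval

most_common_religion = statistics.mode(religion)  # nominal: counting only
median_education = statistics.median(education)   # ordinal: order is meaningful
mean_fight = statistics.mean(fight_scale)         # interval: equal units allow a mean

print(most_common_religion)  # Catholic
print(median_education)      # 2
print(mean_fight)            # 4.2
```

The computer would happily compute a mean of the education codes too, but the result would not mean much, because the distance from 1 to 2 is not known to equal the distance from 2 to 3.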
C What scale should we use when measuring variables?
1. We should use the most appropriate and informative measure.
a. By appropriate, we mean that we should not use a measure that exceeds the information represented by the variable itself.
For instance, since Religious Affiliation is a variable that only contains categories (Protestant, Catholic, Muslim) with no order among the categories, then we should only use a nominal measure in this case.
b. By informative, we mean that we should try to use the most informative measure that we can.
For instance, I could measure income in many different ways.
I could use an ordinal scale (poor, middle class, wealthy)
Or I could use a ratio level measure (in terms of dollars actually earned).
It is to my advantage to collect the data in ratio rather than ordinal form, because I can perform many more statistics when it is collected that way. For instance, I can calculate an average income when data is collected using a ratio scale. If I had used an ordinal scale, I would not have been able to calculate the average income.
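The income example can be sketched in a few lines. The dollar amounts and the cutoffs for "poor", "middle class", and "wealthy" below are made up purely for illustration. The point is that ratio data can always be collapsed down into ordinal categories later, but ordinal categories can never be turned back into dollars.

```python
import statistics

# Hypothetical incomes in dollars (ratio level).
incomes = [18000, 32000, 45000, 61000, 150000]

def income_bracket(dollars, low_cutoff=25000, high_cutoff=100000):
    """Collapse a dollar amount into an ordinal category.

    The cutoffs are arbitrary assumptions for this example.
    """
    if dollars < low_cutoff:
        return "poor"
    if dollars < high_cutoff:
        return "middle class"
    return "wealthy"

mean_income = statistics.mean(incomes)              # possible with ratio data
brackets = [income_bracket(x) for x in incomes]     # ratio -> ordinal is easy

print(mean_income)  # 61200
print(brackets)     # ['poor', 'middle class', 'middle class', 'middle class', 'wealthy']
```

If we had only collected the brackets, the 61200 figure would be unrecoverable.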
So, we should always try to use the most informative while not exceeding the boundaries of what is appropriate.
Main Point:
There are four different types of measures that can be used to collect data regarding variables.
As we move from Nominal to Ratio level measures we gain more information.
How we measure a variable is a very important consideration because it determines the types of analysis we can perform.
When talking about operationalization, we also need to talk about reliability and validity.
III Issues of Reliability and Validity.
A Reliability
In general, reliability simply asks: does our measure produce the same results every time we use it to measure the same item? Or in other words, how dependable is the measure we are using?
There are several ways to assess how reliable a measure is. There are three main ways I want you to know.
1. Test/retest method.
It simply involves measuring the same person or unit on two separate occasions. If the measure has reliability, you will get similar results each time.
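One common way to put a number on "similar results each time" is to correlate the scores from the two occasions. The notes do not name a specific statistic, so treat the Pearson correlation used below as an assumption, and the scores as made-up data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for the same five people on two occasions.
time_1 = [3, 5, 2, 6, 4]
time_2 = [3, 6, 2, 5, 4]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))  # 0.9
```

A value close to 1 suggests the measure is giving consistent results across the two occasions.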
2. Split-half method
This involves dividing what you are measuring into two groups, typically using a random process and then measuring the items, units, or people in each group. If the measure is reliable then both groups produce similar results.
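Following the description above, a rough sketch of the split-half idea is to shuffle the units, cut the list in half, and compare the two halves. Comparing the group means is just one simple way to do the comparison and is an assumption here, as are the scores and the function name.

```python
import random
import statistics

def split_half_means(scores, seed=0):
    """Randomly split scores into two halves and return both halves and their means."""
    rng = random.Random(seed)   # fixed seed so the split is repeatable
    shuffled = list(scores)     # copy so the original list is left untouched
    rng.shuffle(shuffled)
    middle = len(shuffled) // 2
    half_a, half_b = shuffled[:middle], shuffled[middle:]
    return half_a, half_b, statistics.mean(half_a), statistics.mean(half_b)

# Hypothetical scores from eight people on the same measure.
scores = [4, 5, 3, 6, 4, 5, 4, 5]
half_a, half_b, mean_a, mean_b = split_half_means(scores)
print(mean_a, mean_b)
```

If the measure is reliable, the two means should come out close to each other.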
3. Inter-coder and Intra-coder reliability
Both of these techniques are typically used when doing content analysis. That is, trying to describe the content of a message somehow.
Inter-coder reliability- is the degree to which measurement is consistent between people.
For example, researchers in Santa Barbara code TV shows for violence. Different people are assigned to code the same show, and if they do not produce similar observations, then the measure is not reliable. You simply compare one person's measure with another's.
Intra-coder reliability- is the degree to which measurement is consistent within a person. It is sort of like the test-retest method, in that a person is asked to code a message and then code it again. The results are compared, and if the answers are similar, then the measure is reliable.
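The simplest way to compare two sets of codes is percent agreement: the share of items that received the same code both times. This is only one possible index (it does not correct for agreement by chance), and the codes below are made up. The same function works for inter-coder reliability (two people) and intra-coder reliability (one person, two passes).

```python
def percent_agreement(codes_a, codes_b):
    """Proportion of items that received the same code in both sets."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both sets of codes must cover the same items.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

# Hypothetical codes for five scenes from the same TV show.
coder_1 = ["violent", "nonviolent", "violent", "violent", "nonviolent"]
coder_2 = ["violent", "nonviolent", "nonviolent", "violent", "nonviolent"]
coder_1_second_pass = ["violent", "nonviolent", "violent", "violent", "nonviolent"]

inter_coder = percent_agreement(coder_1, coder_2)              # between people
intra_coder = percent_agreement(coder_1, coder_1_second_pass)  # within a person

print(inter_coder)  # 0.8
print(intra_coder)  # 1.0
```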
B Validity
How accurate or true is our measurement? Is it really measuring what we think it is?
Specifically, the term validity refers to how well the conceptual and operational definitions mesh with each other.
The concept of validity is simple but the determination of validity is very elusive.
Validity cannot be assessed directly. The reason we can never achieve absolute validity is that constructs are abstract ideas. Through the collection of evidence over time, a case is built for the validity of measures.
There are two main approaches to validity. One is a subjective evaluation of an operational definition. The other is an approach that is more objective, looking for evidence that is "external" to the investigator.
Subjective Assessment
1. Face Validity
The easiest type of validity to achieve and the most basic kind. In other words, it addresses the question:
"On the face of it, do people believe that the definition and method of measurement fit?" It is a "consensus" method of measurement validity by the scientific community. Face validity is evaluated by a group of judges/experts, who read or look at a measuring technique and decide whether in their opinion it measures what its name suggests. On some level, every instrument must pass the face validity test either formally or informally.
2. Content Validity
It captures the entire meaning. It is concerned with the extent to which a measure adequately represents all facets of a concept.
To demonstrate content validity, one must be able to define clearly and identify the components of the total domain and then show that the observations being made adequately represent these components.
For example, if I give you a test on your understanding of the material presented in the course, I should be concerned that the test has content validity--it includes questions on all aspects of the course.
3 Criterion Validity (more objective assessments)
It uses some standard or criterion that is known to indicate a construct accurately. The validity of an indicator is verified by comparing it with another measure of the same construct in which a researcher has confidence. There are two subtypes of this kind of validity.
a. Concurrent Validity (compare it with other measures)
The new measure agrees with a pre-existing measure. To check this, we compare it with a similar existing measure.
For example, if we create a new measure of "comprehension skills", we might expect it to correlate with a student's GRE-Verbal score, which generally reflects comprehension skills.
This means that most people who score high on the old measure should also score high on the new one and vice versa.
The two measures may not be perfectly associated, but if they measure the same or a similar construct, it is logical for them to yield similar results.
b Predictive Validity
Comparison with future behavior.
An indicator predicts future events that are logically related to a construct.
For example, if the GRE exam has high predictive validity, then students who get high scores will subsequently do well in graduate school. If students with high scores perform the same as students with average scores then the GRE has low predictive validity.
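As a rough sketch of this comparison, the code below splits students into high scorers and everyone else and compares how each group did later on. All of the scores, the later grade point averages, and the cutoff are made up for illustration.

```python
import statistics

# Hypothetical (test score, later grade point average) pairs.
students = [(330, 3.9), (325, 3.7), (320, 3.8), (300, 3.2), (305, 3.4), (298, 3.0)]

def mean_outcome_by_group(records, cutoff):
    """Mean later outcome for high scorers (>= cutoff) and for everyone else."""
    high = [outcome for score, outcome in records if score >= cutoff]
    rest = [outcome for score, outcome in records if score < cutoff]
    return statistics.mean(high), statistics.mean(rest)

# The cutoff of 315 is an arbitrary assumption.
high_mean, rest_mean = mean_outcome_by_group(students, cutoff=315)
print(round(high_mean, 2), round(rest_mean, 2))  # 3.8 3.2
```

A clear gap like this would count as evidence for predictive validity. If the two means were about the same, the test would have low predictive validity.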