The Chi Square Test
I am going to talk about a chi square example in this document that corresponds to the example
you saw in the Descriptive Statistics Crash Course (#1), t-Test Crash Course (#2), and ANOVA
Crash Course (#3) where participants were asked to recall how much money they spent on
textbooks the prior semester. However, for this Chi Square crash course, we are going to focus
on nominal variables (categorical) rather than scaled variables (like ratio or interval). The good
news is that this mini-lecture will sum up the basics of the chi square for you as we look at this
study, but you can find additional information about the chi square in your textbooks. On the
final page are several questions based on this crash course. Answer these questions and go into
your “Crash Course in Statistics – The Chi Square Quiz #4” in your Canvas assessments menu
and copy over your answer. Each Crash Course Quiz counts 5 points.
How, when, and why do a Chi Square?
Before we get to the example, let me give you some basic information about the chi square.
The chi square is used to compare two or more levels of a categorical (nominal or ordinal)
variable. I recommend looking at the crash course on descriptive statistics for more info on
interval and ratio scales, but I want to highlight nominal and ordinal scales here.
Nominal scales are based on assigning items to categories. For example, you can have
“yes” versus “no” categories, or “male” versus “female” categories, or “Honda” versus
“Toyota” versus “Subaru” versus “Ford”. For nominal scales, these are just different
categories, but one option isn’t “better” or “higher” than another (After all, which is
higher: males or females?).
Ordinal scales have more order to them. That is, they are ranked. Thus there might be a
better or worse ranking here (Pizza is ranked highest, Salad second highest, Sandwiches
third highest, Liver lowest for food preference). We might know the order, but we may
not know how spread out those preferences are. That is, maybe pizza, salad, and
sandwiches are all ranked very high in preference but liver is ranked really, really low!
Or think about a race. The first, second, and third place finishers may come in a few
seconds apart while the fourth-place finisher is over a minute behind.
For this crash course, I want to focus only on the nominal scale. A chi square essentially
looks at the percentage of cases that fall into categories of the variable. Let’s say I look at
gender as a variable in my study. I sample 200 people at random. There is a good chance I
would get 100 men and 100 women, but I could also be off a little from that 50/50 ratio. That
is, I could wind up with 95 men and 105 women just based on natural fluctuations of my data
collection and selection procedure. The question becomes, “Is the difference between the
numbers of men and women I observed significantly different from what I
expected to see?” As another example, I might look to see if there are more guilty than not-guilty verdicts in a trial. If I poll jurors and find that 60% found guilt while 40% found no
guilt, I might want to run a chi square to see if the difference is based on chance factors or
something more.
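To make the 95 men versus 105 women example concrete, here is a minimal sketch of the goodness-of-fit calculation in Python. This is not how SPSS does it internally, and the function names are made up for illustration. It assumes we expected a 50/50 split (100 and 100), and the p-value shortcut only works when there is exactly 1 degree of freedom.

```python
import math

def goodness_of_fit(observed, expected):
    """Return (chi_square, df) comparing observed to expected category counts."""
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1
    return chi_square, df

def p_value_df1(chi_square):
    """Upper-tail p-value for a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(chi_square / 2))

observed = [95, 105]   # men, women actually sampled
expected = [100, 100]  # assumption: a 50/50 split was expected
chi_square, df = goodness_of_fit(observed, expected)
p = p_value_df1(chi_square)
print(f"chi2({df}) = {chi_square:.2f}, p = {p:.3f}")  # prints chi2(1) = 0.50, p = 0.480
```

With a p-value this far above .05, a 95/105 split looks like the kind of natural fluctuation described above rather than a real departure from 50/50.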
Because we are looking at categorical variables (men versus women, or guilt versus no guilt),
it is not appropriate to look at means and standard deviations. After all, what is a mean
gender? No such thing, right? Thus, we cannot use a t-Test or ANOVA for this analysis, as
those tests are based on mean scores. The chi square, however, is designed to look at
percentages and frequencies of data. There are two different types of chi squares:
1). Chi Square – Goodness of Fit: In this test, we look at only one variable. Thus, we can
determine whether we have more men in our study than we should by chance, or we can
see if there are more not guilty verdicts than we should have by chance. Although I will
briefly discuss this Goodness of Fit test in the lecture material, we will focus mostly on
the second kind of chi square test in this crash course document …
2). Chi Square – Test of Independence: For this test, we want to see if two variables are
independent. Consider juror gender and verdict in the same chi square model. We might
want to know if juror gender has an impact on verdict or whether verdict is independent
of gender. BOTH variables here are nominal in nature (two levels to gender: male versus
female – two levels to verdict: guilty or not guilty).
The nice thing about chi squares is that there really are no assumptions about the shape of the
distribution (we don’t need a bell-shaped curve like we do for t-Tests and ANOVAs). But the
variables can’t be on scales – just categories. Let’s run a chi square and interpret it in SPSS.
Textbook Study – How Much Did You Spend On Textbooks (High or Low Conditions)
Recall the basic set-up for our money spent on textbooks. Researchers ask participants to
recall how much they spent on textbooks the prior semester, and have each participant write
their answer on a survey sheet. In all conditions, the first ten answer slots are already filled
in, presumably by other respondents. However, the researcher actually completed those ten
slots, and manipulated the dollar amounts so that in the High Dollar Condition, the dollar
amounts ranged from $350 to $450 (see Figure 1). In the Low Dollar Condition, amounts
ranged from $250 to $350 (see Figure 2). For right now, we are going to briefly omit the
“Control” condition (we’ll get back to this third level later in this document) and focus on the
High Dollar Condition and the Low Dollar Condition only.
Imagine we have eight real participants in the High Dollar Condition and eight real
participants in the Low Dollar Condition (and no, we are not including the original dollar
amounts on the sheet passed out by the researcher, as those are not real participants!).
However, instead of asking participants to recall how much they spent on textbooks, we ask
them to look at the other responses on the survey and decide if other respondents spent more
or less than average. We thus have a nominal dependent variable with two answer options:
“higher than average” versus “lower than average”. See Figure 1 for the High Dollar
Condition survey and Figure 2 for the Low Dollar Condition survey.
Consider the data below, noting that I have a dichotomized dependent variable. I will label
“higher” as 1 and “lower” as 2 (though I could just as easily have labeled “higher” as 2 and
“lower” as 1 – for nominal variables, the exact value does not matter, just that different
numbers represent different conditions).
Condition A (High Dollar Condition)
1 (more than average)
1 (more than average)
1 (more than average)
1 (more than average)
1 (more than average)
1 (more than average)
1 (more than average)
2 (less than average)
Condition B (Low Dollar Condition)
1 (more than average)
1 (more than average)
1 (more than average)
2 (less than average)
2 (less than average)
2 (less than average)
2 (less than average)
2 (less than average)
Eyeballing this, it looks like participants selected the “Others spent higher than average”
option (#1 option) response more frequently in the High Dollar Condition while participants
selected the “Others spent lower than average” option (#2 option) more frequently in the Low
Dollar Condition. But is this difference based simply on chance or something else? If it is
based on chance, then I would say that both columns are pretty equal, and thus the High
versus Low Dollar manipulation had no impact. If it is based on something other than chance
(i.e. it is significant), then I can conclude that there is something else at work here (probably
the High versus Low dollar amounts from prior participants!). That is, we can answer the
question “Did our manipulation work, with those in the High Dollar Condition saying that
the prior respondents spent a higher amount of money than average on books and those in
the Low Dollar Condition saying that they spent a lower amount of money than average on books?”
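Before running the test, it can help to tally the raw codes into the counts and row percentages that a crosstabulation table reports. The sketch below does that in Python for the sixteen responses listed above. The variable and function names are hypothetical and not part of the SPSS file.

```python
from collections import Counter

# Codes from the data listing: 1 = "higher than average", 2 = "lower than average"
LABELS = {1: "higher", 2: "lower"}
high_dollar = [1, 1, 1, 1, 1, 1, 1, 2]
low_dollar = [1, 1, 1, 2, 2, 2, 2, 2]

def row_summary(responses):
    """Return {label: (count, row percentage)} for one condition."""
    counts = Counter(responses)
    n = len(responses)
    return {label: (counts[code], 100 * counts[code] / n)
            for code, label in LABELS.items()}

for name, responses in (("High Dollar", high_dollar), ("Low Dollar", low_dollar)):
    summary = row_summary(responses)
    cells = ", ".join(f"{label}: {count} ({pct:.1f}%)"
                      for label, (count, pct) in summary.items())
    print(f"{name} -> {cells}")
```

The tallies come out to 7 higher and 1 lower in the High Dollar Condition (87.5% and 12.5%), and 3 higher and 5 lower in the Low Dollar Condition (37.5% and 62.5%).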
Let’s see how to run this in SPSS. Since this example involves two different variables with
categorical levels, we will run a chi square test of independence. For the next section, I am
going to open SPSS and run a chi square test of independence. I’ll use screenshots from
SPSS as I go, but feel free to run these analyses yourself. Just set up your SPSS file like mine
(I also included this SPSS file for you in Canvas if you prefer to use that. It is called “Crash
Course Quiz #4– Textbook Money (Chi Square Practice)”, but it is a short data set so I
recommend setting up your own SPSS file using the values from the table above). I am just
going to give you the basics here, but you can refer to other sources to figure out some of the
info we get from the chi square not covered in this lecture.
SPSS – Our Textbook Money Study
1. Click Analyze > Descriptive Statistics > Crosstabs… on the top menu as shown below.
You will be presented with the following (though note that I changed the “Did you spend more”
variable into a nominal variable, as denoted by the nominal-measure symbol next to its name. I also chose to display the
dependent variable name as “HigherLower” rather than the mouthful phrase “Did others spend
higher or lower than average”):
Transfer one of the variables into the “Row(s):” box and the other variable into the
“Column(s):” box. In our example we will transfer the “DollarCondition” variable (our
independent variable) into the “Row(s):” box and “HigherLower” into the “Column(s):”
box (our dependent variable). There are two ways to do this. You can highlight the
variable with your mouse and then use the arrow button to transfer the variables or you can
drag-and-drop the variables. How do you know which variable goes in the row box and
which goes in the column box? There is no right or wrong way. It will depend on how
you want to present your data, so feel free to try it either way.
If you want to display clustered bar charts, then make sure that the “Display clustered bar
charts” checkbox is ticked. I usually don’t need the chart, but if you want a visual aid
showing what the data look like, the bar chart might help.
You will end up with a screen similar to the one below:
2. Click the “Statistics…” button. Select the “Chi-square” and “Phi and Cramer’s V” options
as shown below, then click the “Continue” button.
3. Click the “Cells…” button. Select “Observed” from the “Counts” area and “Row” from
the “Percentages” area as shown below:
4. After clicking “Continue”, click the “OK” button to generate your output.
Output of the Independent Chi Square in SPSS
You will see several tables for the chi square, but ignore the case processing table.
Condition (1 = High, 2 = Low) * Money Spent (1 = Higher, 2 = Lower) Crosstabulation
The nice thing about this table is it tells you exactly what you ran in the name of the table!
As you can see, those in the High Dollar Condition chose the “Higher (than average)” option
most frequently while those in the Low Dollar Condition chose the “Lower (than average)”
option most frequently. But is this statistically significant? For that, we look to the next table.
Chi-Square Tests
Focus on the Pearson Chi-Square row (and ignore the others). Based on Pearson, we can see that
the chi square is significant, χ2(1) = 4.27, p = .039. Keep in mind that our degrees of freedom
here equal 1. We calculate the df with the formula (k1 – 1) X (k2 – 1), where each k is the number of
levels of one variable. That is, k1 is the number of levels for Dollar Condition (2 levels, High vs
Low) and k2 is the number of levels for money spent (also 2 levels, Higher than
average and Lower than average). Plug those numbers of levels into our (k1 – 1) X (k2 – 1)
formula, and we get (2 – 1) X (2 – 1), or 1 X 1 = 1.
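If you are curious where the 4.27 comes from, the sketch below computes the Pearson chi square by hand in Python from the observed counts (7 and 1 in the High Dollar Condition, 3 and 5 in the Low Dollar Condition). It is an illustration only, with a hypothetical function name. The p-value line uses a shortcut that is valid only when df = 1.

```python
import math

def chi_square_independence(table):
    """Return (chi_square, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)  # (k1 - 1) X (k2 - 1)
    return chi_square, df

# Rows: High Dollar, Low Dollar. Columns: higher than average, lower than average.
table = [[7, 1],
         [3, 5]]
chi_square, df = chi_square_independence(table)
p = math.erfc(math.sqrt(chi_square / 2))  # shortcut valid only for df == 1
print(f"chi2({df}) = {chi_square:.2f}, p = {p:.3f}")  # prints chi2(1) = 4.27, p = 0.039
```

If SciPy happens to be installed, `scipy.stats.chi2_contingency(table, correction=False)` should give the same Pearson value. The `correction=False` part matters because the default applies a continuity correction to 2 X 2 tables.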
Symmetric Measures
Our final table looks at symmetric measures. As you can see below, both Phi and Cramer’s V are
high (.516 on a 0 to +1 scale, which counts as a large effect). Both are significant at p =
.039. However, we use the “Phi” row for designs like the one we just described (a 2 X 2 study
design). Thus we would focus on that Phi row to assess our 2 X 2 design. Use Cramer’s V for all
other chi squares (like a 2 X 3 design, or a 3 X 3 design).
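Both effect sizes can be recovered from the chi square value and the sample size. The sketch below uses the standard formulas (phi is the square root of chi square divided by N, and Cramer’s V also divides by the smaller of rows minus 1 and columns minus 1). The function names are hypothetical, and the sketch returns only the magnitude of phi, not a sign.

```python
import math

def phi_coefficient(chi_square, n):
    """Magnitude of phi for a 2 X 2 table."""
    return math.sqrt(chi_square / n)

def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V for a table with n_rows rows and n_cols columns."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))

chi_square = 64 / 15  # unrounded value behind the reported 4.27 (7/1 vs. 3/5 table)
n = 16                # eight participants per condition

phi = phi_coefficient(chi_square, n)
v = cramers_v(chi_square, n, n_rows=2, n_cols=2)
print(f"phi = {phi:.3f}, Cramer's V = {v:.3f}")  # prints phi = 0.516, Cramer's V = 0.516
```

In a 2 X 2 table the extra divisor is 1, which is why the two rows of the Symmetric Measures table show the same value.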
Interpreting the Independent Chi Square in SPSS
If a significant result occurs, the write up looks like this:
A chi square test of independence was calculated comparing how much participants in the
High Dollar Condition versus Low Dollar Condition thought others spent on average for
books (higher than average versus lower than average). A significant relationship emerged,
χ2(1) = 4.27, p = .039. Most participants in the High Dollar Condition (87.5%) thought
other respondents spent more money than average on books while most participants in the
Low Dollar Condition (62.5%) thought other respondents spent less money than average on
books. Phi showed a large effect. This indicates that the High versus Low Dollar
manipulation worked as intended.
If non-significant, the write up for the chi square is even easier. You simply write:
A chi square test of independence was calculated comparing how much participants in the
High Dollar Condition versus participants in the Low Dollar Condition thought others
spent on average for books (higher than average versus lower than average). No significant
relationship was found, χ2(1) = 1.27, p = .351, and phi did not show a large effect. This
indicates that the High versus Low Dollar manipulation did not work, as the frequencies of
answers were similarly distributed across cells.
Another quick example – Three levels to your dependent variable.
Now, assume that we still have two levels to our independent variable of Dollar Condition: High
and Low. However, we add another level to our dependent variable, with the available options
including “Higher than average”, “Lower than average”, and a new “Average” option. Consider
our SPSS output (next page). For the data on the next page, we have the following interpretation:
A chi square test of independence was calculated comparing how much participants in the
High Dollar Condition versus Low Dollar Condition thought others spent on average for
books (higher than average vs. lower than average vs. average). A significant relationship
emerged, χ2(2) = 8.57, p = .014. Most participants in the High Dollar Condition (75%)
thought other respondents spent more money than average on books while most
participants in the Low Dollar Condition (62.5%) thought other respondents spent less
money than average on books. Cramer’s V showed a large effect. This indicates that the
High versus Low Dollar manipulation worked as intended.
Note that our df changed a bit. We now have three levels in k2, so (k1 – 1) X (k2 – 1) gives us
(2 – 1) X (3 – 1), or 1 X 2 = 2. Also, note that the write up uses Cramer’s V rather than Phi (Phi
is best used for a 2 X 2 table, but here we have a 2 X 3 table).
A final example – Three levels to your main independent variable and three levels to your
dependent variable
Finally, assume that we have three levels to our independent variable of Dollar Condition: High,
Low, and Average. We also have three levels to our dependent variable, with the options
including “Higher than average”, “Lower than average”, and “Just about average”. See Figure 1
for an example of the High Dollar Condition with three dependent variable answer options.
Figure 1: High Dollar Condition with Three Answer Options
If this 3 X 3 test is significant, we would conclude the following (based on the SPSS output
below):
A chi square test of independence was calculated comparing how much participants in the
High Dollar Condition versus Low Dollar Condition versus Average Dollar Condition
thought others spent on average for books (higher than average vs. lower than average vs.
average). A significant relationship emerged, χ2(4) = 19.79, p = .001. Most participants in
the High Dollar Condition (75%) thought other respondents spent more money than
average on books, most participants in the Low Dollar Condition (62.5%) thought other
respondents spent less money than average on books, and most participants in the Average
Dollar Condition (75%) thought other respondents spent an average amount on books.
Cramer’s V showed a large effect. This indicates that the High versus Low versus Average
Dollar manipulation worked as intended.
Notice that our df is now 4. Using (k1 – 1) X (k2 – 1), we now have (3 – 1) X (3 – 1), or 2 X 2 = 4.
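As a quick check on the three examples in this crash course, the sketch below applies the df formula to each table size and recomputes the p-values reported for the 2 X 3 and 3 X 3 write ups. It is a minimal illustration with hypothetical function names. The p-value function uses a closed-form expression that holds only when df is an even number, so it does not cover the df = 1 case.

```python
import math

def degrees_of_freedom(k1, k2):
    """df for a chi square test of independence: (k1 - 1) X (k2 - 1)."""
    return (k1 - 1) * (k2 - 1)

def p_value_even_df(chi_square, df):
    """Upper-tail p-value for a chi-square statistic when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive, even df")
    half = chi_square / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

print(degrees_of_freedom(2, 2))  # 2 X 2 design, prints 1
print(degrees_of_freedom(2, 3))  # 2 X 3 design, prints 2
print(degrees_of_freedom(3, 3))  # 3 X 3 design, prints 4

print(f"{p_value_even_df(8.57, 2):.3f}")   # 2 X 3 example, prints 0.014
print(f"{p_value_even_df(19.79, 4):.3f}")  # 3 X 3 example, prints 0.001
```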
Crash Course In Statistics – The Chi Square – Quiz #4 (Coaster, Summer 2023)
Instructions: In your prior Crash Course Quizzes (#2 and #3), you focused on a study looking at
the excitation-transfer theory, or the idea that when a person becomes aroused physiologically
there is a subsequent period of time when the person will continue to experience a high state of
residual arousal yet be unaware of it. If additional stimuli are encountered during this time, the
individual may mistakenly attribute their residual arousal from the previous stimuli to future
stimuli. Using that same study design, complete the questions below and transfer your answers to
your Crash Course in Statistics – The Chi Square Quiz #4 in Canvas (1 point per question).
IMPORTANT: The answer options on Canvas may not be in the same order you see them below,
so make sure to copy over the CONTENT of the answer and not simply the answer letter (A, B,
C, D, or E).
Chi Square Crash Course Quiz Part A
You conduct a similar study using the same two groups we used for the t-Test. Recall that in that
study, participants were asked how much they would like to take a woman on a date. However,
some participants provided their ratings after riding a rollercoaster while others provided their
ratings while waiting in line to ride a rollercoaster. (Note: For the Chi Square Crash Course #4
Part A, ignore the “Waiting for food” condition you saw in the ANOVA crash course quiz.)
But you wonder whether your participants are already in a relationship, as that might impact their
assessments of whether they would want to date the woman in the dating profile. Thus you ask
them, “Are you currently in a romantic relationship?” with 1 = Currently single and 2 = In a
relationship. You hope that there are no differences in relationship status between participants in
the just rode condition and the waiting to ride condition, so you run a chi square to assess this
possibility.
Note: If you want to run these analyses yourself, look for the SPSS file called “#4 Chi Square
Crash Course Data Coaster Summer A” in Canvas. Running the analysis is not required, as the
data are presented below, but it is definitely recommended if you want some SPSS practice!
You get the following data:
1). How many participants in the Just rode and Waiting to ride conditions are in a relationship?
A. A total of 11 participants (36.7%) in the just rode condition are in a relationship while
21 participants (30%) in the waiting to ride condition are in a relationship.
B. A total of 19 participants (63.3%) in the just rode condition are in a relationship while
9 participants (30%) in the waiting to ride condition are in a relationship.
C. A total of 19 participants (63.3%) in the just rode condition are in a relationship while
21 participants (70%) in the waiting to ride condition are in a relationship.
D. A total of 20 participants (33.3%) in the just rode condition are in a relationship while
40 participants (66.7%) in the waiting to ride condition are in a relationship.
E. A total of 30 participants (100%) in the just rode condition are in a relationship while
30 participants (10%) in the waiting to ride condition are in a relationship.
2). We used a chi square above to see if participants’ relationship status (currently single versus
in a relationship) differed across our two rollercoaster conditions. Could we use a t-Test to also
assess this possibility? Select the appropriate answer.
A. Yes, we can run a t-Test. The t-Test relies on continuous variables, and the
relationship-status dependent variable is continuous (scaled).
B. Yes, we can run a t-Test. Since the new relationship-status dependent variable is a
dichotomous or nominal variable, a t-Test is appropriate to use
C. No, we cannot run a t-Test. Since the new relationship-status dependent variable here
is a continuous (scaled) response, so we cannot run a t-Test. A chi square is more
appropriate for this new dependent variable
D. No, we cannot run a t-Test. Since the new relationship-status dependent variable is a
categorical-based response (or a dichotomous or nominal variable), we cannot run a t-Test. A chi square is more appropriate.
E. There is not enough information in this study to decide if we can run a t-Test
3). Which of the following represents the correct way to write out the results for this chi square
in an APA formatted results section?
A. A chi square test of independence was calculated to determine whether the relationship-status of the participants differed across the just rode and waiting to ride conditions. A
significant difference between conditions failed to emerge, χ2(4) = 0.30, p = .584. In the
just rode condition, 19 participants (or 63.3% of those in this condition) stated they were
in a relationship. In the waiting to ride condition, 21 participants (or 70% of those in this
condition) stated they were in a relationship. Phi showed a weak effect. This indicates that
participant relationship status was similar across both rollercoaster conditions.
B. A chi square test of independence was calculated to determine whether the relationship-status of the participants differed across the just rode and waiting to ride conditions. A
significant difference between conditions failed to emerge, χ2(1) = 0.30, p = .584. In the
just rode condition, 21 participants (or 70% of those in this condition) stated they were in
a relationship. In the waiting to ride condition, 19 participants (or 63.3% of those in this
condition) stated they were in a relationship. Phi showed a weak effect. This indicates that
participant relationship status was similar across both rollercoaster conditions.
C. A chi square test of independence was calculated to determine whether the relationship-status of the participants differed across the just rode and waiting to ride conditions. A
significant difference between conditions failed to emerge, χ2(1) = 0.30, p = .584. In the
just rode condition, 19 participants (or 63.3% of those in this condition) stated they were
in a relationship. In the waiting to ride condition, 21 participants (or 70% of those in this
condition) stated they were in a relationship. Phi showed a weak effect. This indicates that
participant relationship status was similar across both rollercoaster conditions.
D. A chi square test of independence was calculated to determine whether the relationship-status of the participants differed across the just rode and waiting to ride conditions. A
significant difference between conditions emerged, χ2(1) = 0.30, p = .005. In the just rode
condition, 19 participants (or 63.3% of those in this condition) stated they were in a
relationship. In the waiting to ride condition, 21 participants (or 70% of those in this
condition) stated they were in a relationship. Phi showed a strong effect. This indicates
that participant relationship status was significantly higher in the waiting to ride condition
than in the just rode condition.
E. A chi square test of independence was calculated to determine whether the relationship-status of the participants differed across the just rode and waiting to ride conditions. A
significant difference between conditions emerged, χ2(1) = 0.30, p = .005. In the just rode
condition, 21 participants (or 70% of those in this condition) stated they were in a
relationship. In the waiting to ride condition, 19 participants (or 63.3% of those in this
condition) stated they were in a relationship. Phi showed a strong effect. This indicates
that participant relationship status was significantly higher in the just rode condition than
in the waiting to ride condition.
Chi Square Crash Course Quiz Part B
You design a new study in which you look at all three conditions from the One-Way ANOVA
crash course quiz (Just rode, Waiting to ride, or Waiting for food). However, you also want to
see if participants experience different levels of physiological arousal. Therefore you alter your
dependent variable to the following: “Which of the following three options best describes your
current heartrate?: 1 = Beating very fast, 2 = Beating moderately fast, 3 = Beating slow”. You
get the following data (Note: This does differ from the tables used for questions 1, 2, and 3! If
you want to run the data yourself, use the “#4 Chi Square Crash Course Data Coaster Summer
B” in Canvas – not required, but recommended!):
4). You assume that participants in the Just rode condition will report having a very fast
heartrate, participants in the Waiting to ride condition will report having a moderately fast
heartrate, and participants in the Waiting for food condition will report having a slow heartrate.
Focusing on those cells, choose the option below that best represents the crosstabulation table.
A. A total of 20 participants (66.7%) in the Just rode condition said their heart was
beating very fast. A total of 6 participants (63.3%) in the Waiting to ride condition said
their heart was beating moderately fast. A total of 1 participants (56.7%) in the Waiting
for food condition said their heart was beating slow.
B. A total of 20 participants (66.7%) in the Just rode condition said their heart was
beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said
their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting
for food condition said their heart was beating slow.
C. A total of 30 participants (66.7%) in the Just rode condition said their heart was
beating very fast. A total of 30 participants (63.3%) in the Waiting to ride condition said
their heart was beating moderately fast. A total of 30 participants (56.7%) in the Waiting
for food condition said their heart was beating slow.
D. A total of 20 participants (30%) in the Just rode condition said their heart was beating
very fast. A total of 6 participants (43.3%) in the Waiting to ride condition said their heart
was beating moderately fast. A total of 12 participants (26.7%) in the Waiting for food
condition said their heart was beating slow.
E. A total of 20 participants (100%) in the Just rode condition said their heart was beating
very fast. A total of 6 participants (100%) in the Waiting to ride condition said their heart
was beating moderately fast. A total of 1 participants (10%) in the Waiting for food
condition said their heart was beating slow.
5). Which of the following represents the correct way to write out the results for this chi square
in an APA formatted results section?
A. A chi square test of independence was calculated to see if participants’ heartrate
differed depending on their condition. A significant relationship emerged, χ2(1) = 42.08,
p < .001. A total of 20 participants (66.7%) in the Just rode condition said their heart was
beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said
their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting
for food condition said their heart was beating slow. Cramer’s V was very strong. This
indicates that participant heartrate was fastest when they finished riding the rollercoaster
followed by participants who were about to ride the rollercoaster. Heartrates were slowest
for those waiting for food (i.e. not waiting to ride a rollercoaster).
B. A chi square test of independence was calculated to see if participants’ heartrate
differed depending on their condition. A significant relationship emerged, χ2(4) = 42.08,
p < .001. A total of 20 participants (66.7%) in the Just rode condition said their heart was
beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said
their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting
for food condition said their heart was beating slow. Cramer’s V was very strong. This
indicates that participant heartrate was fastest when they finished riding the rollercoaster
followed by participants who were about to ride the rollercoaster. Heartrates were slowest
for those waiting for food (i.e. not waiting to ride a rollercoaster).
C. A chi square test of independence was calculated to see if participants’ heartrate
differed depending on their condition. A significant relationship emerged, χ2(4) = 42.08,
p < .001. A total of 20 participants (30%) in the Just rode condition said their heart was
beating very fast. A total of 6 participants (43.3%) in the Waiting to ride condition said
their heart was beating moderately fast. A total of 12 participants (26.7%) in the Waiting
for food condition said their heart was beating slow. Cramer’s V was very strong. This
indicates that participant heartrate was fastest when they finished riding the rollercoaster
followed by participants who were about to ride the rollercoaster. Heartrates were slowest
for those waiting for food (i.e. not waiting to ride a rollercoaster).
D. A chi square test of independence was calculated to see if participants’ heartrate
differed depending on their condition. A significant relationship emerged, χ2(4) = 90.00,
p < .001. A total of 20 participants (66.7%) in the Just rode condition said their heart was
beating very fast. A total of 19 participants (63.3%) in the Waiting to ride condition said
their heart was beating moderately fast. A total of 17 participants (56.7%) in the Waiting
for food condition said their heart was beating slow. Cramer’s V was very strong. This
indicates that participant heartrate was fastest when they finished riding the rollercoaster
followed by participants who were about to ride the rollercoaster. Heartrates were slowest
for those waiting for food (i.e. not waiting to ride a rollercoaster).
E. A chi square test of independence was calculated to see if participants’ heartrate
differed depending on their condition. A significant relationship failed to emerge, χ2(4) =
42.08, p > .05. A total of 20 participants (66.7%) in the Just rode condition said their
heart was beating very fast. A total of 19 participants (63.3%) in the Waiting to ride
condition said their heart was beating moderately fast. A total of 17 participants (56.7%)
in the Waiting for food condition said their heart was beating slow. Cramer’s V was very
weak. This indicates that participant heartrate did not differ significantly between the Just
rode, Waiting to ride, and Waiting for food conditions.