Patient ID  Smoking status  Age  Albumin (g/dl)  Gender  BMI   Exercise  Systolic BP (mmHg)
A0001       Yes             32   3.5             Male    24.7  No        120
A0002       No              28   4               Female  29.6  No        128
A0003       No              36   4.2             Female  25.9  Yes       126
A0004       Yes             38   4.8             Female  20.3  No        110
A0005       No              29   3.8             Male    22.5  Yes       136
A0006       Yes             35   4.3             Male    22.5  Yes       118
A0007       No              39   4               Male    24.8  No        120
A0008       Yes             40   4.5             Female  23.5  Yes       116
A0009       No              38   3.5             Female  25.5  No        128
A0010       Yes             32   2.8             Male    21.5  No        126
A0011       No              36   5               Male    20.4  Yes       120
A0012       Yes             35   3.4             Female  23.5  No        118
A0013       Yes             29   3               Male    20.5  No        110
A0014       No              26   2.8             Female  25.5  Yes       110
A0015       Yes             45   4               Male    28.6  No        138
A0016       No              40   2.5             Female  29.9  No        140
A0017       No              39   3               Male    30    Yes       139
A0018       Yes             36   2.8             Male    30.3  No        140
A0019       No              28   3               Female  28.9  Yes       140
A0020       Yes             30   4.5             Male    23.8  Yes       118
A0021       No              28   5               Female  24.6  No        120
A0022       Yes             45   2.6             Female  38.5  No        138
A0023       No              40   4.5             Male    26.7  Yes       120
A0024       Yes             39   2.5             Female  22.8  Yes       116
A0025       No              36   2.8             Female  30.5  No        126
A0026       Yes             28   4               Male    26.8  Yes       110
A0027       No              30   2.8             Female  25.6  No        116
A0028       Yes             28   3               Female  27.8  Yes       129
A0029       No              36   4               Female  24.6  Yes       120
A0030       Yes             38   2.2             Male    38.5  No        138
Measuring Agreement: Correlational Techniques
PHC 121: Introduction to Biostatistics
Upon completion of this lecture, you will be able to:
• Explain correlational analysis
• Explain bivariate relationships
• Describe and assess scatterplots
• Identify different types of associations by examining a scatterplot
• Identify statistical tests carried out in correlational analysis
• Distinguish between Pearson’s correlation coefficient and Spearman’s
correlation coefficient
• Interpret the results of correlation analysis
• Describe partial correlation
• Explain other uses of correlational tests
Introduction to Correlation Analysis
Introduction
• Correlation analysis is used widely in the social and health sciences.
• Correlational techniques look at the relationship or association between
variables.
• They do not look at differences between means.
• There are no independent or dependent variables in the usual sense of the
word.
• For example, if we took a sample of people and asked them to fill in
questionnaires relating to
a) stress
b) happiness
then we could enter the data and find out whether stress and happiness are
related or correlated with each other.
• Suppose we found that stress and happiness were highly associated, in that
happier people tended to be less stressed and people who were not so happy
tended to be more stressed.
• We still couldn’t say whether low stress causes people to be happier, or
whether being happier causes a person to be less stressed.
Bivariate Relationships
• A bivariate relationship is where one variable shows an association or
correlation with a second variable.
• Bivariate correlation techniques assess the strength and magnitude of the
association or relationship between two variables, and the associated p-value
shows us whether such a relationship is likely to be due to sampling error (or
chance).
• Correlation techniques are not used to assess differences between variables.
• When researchers have correlational hypotheses, i.e., they are looking for or
expecting there to be a relationship between variables, then their study is
correlational.
• For example, Mok & Lee (2008) examined the relationship between anxiety and
pain intensity in patients with low back pain.
• Correlation analyses are often directional.
• When researchers state a directional hypothesis, they are able to use a
one-tailed level of significance when assessing results; if they simply predict a
relationship but haven’t got any logical reason to predict the direction of the
relationship, then they use a two-tailed level of significance.
Scatterplot
• A scatterplot is used to visualize the relationship between two variables.
• A scatterplot, also called a scattergram, has a horizontal axis called the
x-axis and a vertical axis called the y-axis.
• The scatterplot in Figure 10.1 shows the positive relationship between
social support and happiness.
• Each data point represents one person’s score.
• We can also predict a negative relationship; for instance, we might predict a
negative relationship between social support and fatigue.
• This means that people with a low score on social support would tend to
score high on fatigue, and conversely people who score high on social
support would tend to have a low score on fatigue.
• Figure 10.2 shows the negative relationship between social support and
fatigue.
Figure 10.1. Scatterplot showing relationship between social support and happiness.
Figure 10.2. Scatterplot showing relationship between social support and fatigue.
• Imagine a situation where there is no association between variables.
• In this case, you would expect the scatterplot to show no particular trend,
and the data points would be randomly distributed in the plot (Figure 10.3).
Figure 10.3. Scatterplot showing no particular trend.
Statistical Tests
• While performing a correlation analysis, the strength of the relationship
between variables is assessed, not by scatterplot, but by performing a
statistical test.
• The parametric test which assesses a correlational relationship is called
Pearson’s Product Moment Correlation (or Pearson’s r for short).
• If the data are skewed, it is recommended that you use a non-parametric
equivalent called Spearman’s rho.
• The strength of a correlational relationship is measured on a scale from
0 (no relationship) to +1 (perfect positive relationship), and from
0 (no relationship) to -1 (perfect negative relationship).
• The nearer to 1, whether positive or negative, the stronger the relationship,
and the nearer to 0, the weaker the relationship (see Figure 10.4).
• Figure 10.5 and Figure 10.6 represent the perfect positive and the
perfect negative correlations, respectively.
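As a rough illustration of the two tests named above, the sketch below computes Pearson’s r from its definition and Spearman’s rho as Pearson’s r applied to ranks. The function names and the social support, happiness and fatigue scores are hypothetical, invented only for this example; they do not come from the lecture or the textbook.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of the data."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical scores for six people (illustration only).
social_support = [2, 4, 5, 7, 9, 10]
happiness = [1, 3, 6, 5, 8, 12]
fatigue = [12, 9, 8, 6, 3, 1]

print("support vs happiness:", pearson_r(social_support, happiness))
print("support vs fatigue:  ", pearson_r(social_support, fatigue))
print("support vs happiness (rho):", spearman_rho(social_support, happiness))
```

Both functions return a value between -1 and +1, so the results can be read against the scale described above: the sign gives the direction of the relationship and the distance from 0 gives its strength.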
Figure 10.4. Strength of correlation coefficients.
Figure 10.5. Scatterplot showing perfect positive correlation.
Figure 10.6. Scatterplot showing perfect negative correlation.
Partial Correlation
• Partial correlation is the correlation between two variables whilst controlling
for a third variable (or more).
• We also call this partialling or co-varying.
• For example, researchers might want to look at the relationship between
symptoms and quality of life, with depression held constant (partialled out,
covaried).
• If the correlation between symptoms and quality of life reduces after
partialling out depression, then the conclusion is that part of the
relationship between symptoms and quality of life is due to depression.
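The lecture does not give a formula for partialling out a variable. As a minimal sketch, the standard first-order partial correlation formula can be applied to three pairwise Pearson coefficients. The function name `partial_r` and the three coefficient values below are hypothetical, chosen only to mirror the symptoms, quality of life and depression example.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out (first-order formula)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical Pearson coefficients, invented for illustration only:
r_symptoms_qol = -0.60         # symptoms vs quality of life
r_symptoms_depression = 0.50   # symptoms vs depression
r_qol_depression = -0.70       # quality of life vs depression

adjusted = partial_r(r_symptoms_qol, r_symptoms_depression, r_qol_depression)
print(f"zero-order r = {r_symptoms_qol:.2f}, partial r = {adjusted:.2f}")
```

With these made-up values the partial correlation is smaller in magnitude than the original one, which is the pattern interpreted above as part of the relationship being due to depression.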
Spearman’s Rho
• Quite often in the health sciences, researchers have small samples which
are often skewed, hence the use of Spearman’s rho is quite common.
• This is a non-parametric test, which makes no assumptions of normality.
• All things being equal, non-parametric tests are not as powerful as
parametric ones.
• However, in circumstances where a non-parametric test is appropriate, it can
actually be more powerful than its parametric equivalent.
Other Uses of Correlational Techniques
Reliability of Measures
• In a healthy sample of individuals you would expect there to be a high
association between body temperature today and body temperature
tomorrow, or in a week, or in a month.
• There will be variability of course, but you would expect this measure to
have a high correlation.
• If the correlation coefficient were 0.90, this would indicate high reliability.
Test-retest Reliability
• When designing questionnaires, researchers need to ensure their
reliability.
• Part of the procedure is to give the questionnaires to people at time 1,
and then to re-test them later.
• This kind of reliability is called test-retest reliability.
• The scores on the questionnaire(s) at time 1 and time 2 are correlated,
and Pearson’s r will show the degree of reliability.
• It is usual to accept 0.7 or above as being of good reliability.
Internal Consistency
• Sometimes, researchers design a questionnaire with discrete scales
incorporated into it.
• For instance, the Hospital Anxiety and Depression Scale (HADS) is
composed of two scales: items relating to depression and items relating
to anxiety.
• In this case, the seven items relating to depression should correlate
highly with one another.
• The seven items relating to anxiety should also correlate highly with
one another.
• This sort of reliability measures internal consistency.
• Cronbach’s alpha is often reported when researchers want to give a
measure of internal consistency.
• As with other correlation coefficients, a scale should have a value of >0.7
in order to be considered reliable.
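The lecture names Cronbach’s alpha without giving its formula. The sketch below uses the usual definition, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the item scores (four items, five respondents) are hypothetical and are not HADS data.

```python
from statistics import variance  # sample variance (n - 1 denominator)

def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores for the same respondents."""
    k = len(items)
    # Total score per respondent, summing across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses to four items of one scale from five respondents.
scale_items = [
    [2, 3, 1, 0, 3],
    [2, 2, 1, 1, 3],
    [3, 3, 0, 1, 2],
    [1, 3, 1, 0, 3],
]

alpha = cronbach_alpha(scale_items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

When the items rise and fall together across respondents, the variance of the total scores is large relative to the summed item variances, and alpha approaches 1.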
Inter-rater Reliability
• Often used in observational studies.
• This type of reliability measure is used where two or more raters are
observing behaviour of some sort.
• For instance, two or more different nurses may be observing patients and
rating them on a particular questionnaire.
Validity
• Correlational analysis can also help with validity.
• One way to assess this is to correlate the questionnaire with other
established questionnaires with known reliability.
• For example, if you were to design a questionnaire measuring depression,
you could give your questionnaire to participants along with HADS and/or
the Centre for Epidemiologic Studies Depression Scale (CES-D).
• If your questionnaire really measures depression, there should be a high
degree of correlation between the newly designed questionnaire and these
established questionnaires.
Percentage Agreement / Cohen’s Kappa
• A simple percentage agreement is easily calculated.
• You simply count up the number of times the raters agreed, divide
this by the total number of observations, and then multiply by 100.
• A more reliable way to assess agreement than simple percentage
agreement is Kappa.
• This uses a correlation coefficient as a measure of agreement.
• Kappa is more reliable because the formula corrects the percentage of
agreement for agreement due to chance.
• It does this by looking at the observed percentage of agreement and
subtracting the percentage of agreements which would be expected by
chance alone.
• The calculation involves dividing the resulting figure by 1 minus the
percentage of agreements which would be expected by chance alone.
• Kappa is sometimes called the chance-corrected percentage of
agreement.
• SPSS can calculate all these statistics under the Reliability Analysis
procedure.
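The two calculations described above can be sketched as follows, working in proportions rather than percentages. Chance agreement is estimated here from each rater’s own category frequencies, which is the usual choice for Cohen’s Kappa with two raters and is an assumption of this sketch. The function names and the ratings from two hypothetical nurses are made up for illustration.

```python
from collections import Counter

def percentage_agreement(rater_a, rater_b):
    """Number of agreements divided by number of observations, times 100."""
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * agreements / len(rater_a)

def cohen_kappa(rater_a, rater_b):
    """(observed - expected) / (1 - expected), with agreement as proportions."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category at random.
    # If both raters use one identical category throughout, expected is 1 and
    # kappa is undefined (division by zero).
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n) for c in set(rater_a) | set(rater_b)
    )
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of ten patients by two nurses.
nurse_1 = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no"]
nurse_2 = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]

print("percentage agreement:", percentage_agreement(nurse_1, nurse_2))
print("kappa:", cohen_kappa(nurse_1, nurse_2))
```

In this made-up example the nurses agree on 8 of 10 patients, but because each uses "yes" and "no" equally often, half of that agreement would be expected by chance, so Kappa is noticeably lower than the raw agreement.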
Additional Information
All information provided in this presentation was obtained from the following
source:
Dancey C, Reidy J, Rowe R (2012) Statistics for the Health Sciences: A
Non-Mathematical Introduction.
*For additional information on any of the topics mentioned in this
presentation, please refer to Chapter Ten: Measuring Agreement:
Correlational Techniques.
College of Health Sciences
Department of Public Health
ASSIGNMENT COVER SHEET
Course name: Introduction to Biostatistics
Course number: PHC 121
CRN:
Assignment title or task:
Use the dataset provided to answer the following questions: Q1-Q3.
A study was conducted on patients with liver disease and some of their
observations were included in the attached Excel sheet.
Q1. Write a statistical summary of participants’ demographic and health
details that includes important variables such as albumin level (g/dl), age,
gender and smoking status. (Marks: 3)
Q2. Compute the most appropriate measures of central tendency and dispersion
to report the variable “Albumin level”. (Marks: 2)
Q3. What is the appropriate statistical test to identify any significant
difference in albumin levels between gender groups? Write down the
assumptions of the proposed statistical test. Formulate the null hypothesis
and alternative hypothesis for such an association. (Marks: 3)
Q4. Discuss correlation analysis and mention its applications in health
research with an appropriate example. (Marks: 2)
Student name: XXXXX
Student ID: XXXXX
Submission date: XXXXX
Instructor name: Hadia Meashi
Grade: …. out of 10
Submission Guidelines:
• Write your answers in 1-2 page(s), not less than 200 words and a maximum of
500 words, excluding references and data tables, with a font of Times New
Roman and size of 12. The line spacing should be 1.5.
• Headings should be bold and text color should be black.
• AVOID PLAGIARISM.
• Assignments must be submitted with the filled Cover Page (student information
should be filled in on the first page of the document).
• Assignments must carry at least TWO up-to-date references using APA style.
Please see the web link below about how to cite in APA reference style:
https://guides.libraries.psu.edu/apaquickguide/intext
• Assignment should be submitted on time by the end of Week 10
(Saturday 3/06/2023 at 11:59 pm).