This week covers probability and nonprobability sampling. Discuss in detail the characteristics of probability and nonprobability sampling. Discuss why researchers would use conditional probability
instead of unconditional probability in their study.
Embed course material concepts, principles, and theories (which require supporting citations) in your
initial response along with at least two scholarly, peer-reviewed journal articles. Keep in mind that these
scholarly references can be found in the Saudi Digital Library by conducting an advanced search specific
to scholarly references. Use Saudi Electronic University academic writing standards and APA style
guidelines.
Chapter 5
The Role of
Probability
Learning Objectives (1 of 3)
• Define the terms “equally likely” and “at
random”
• Compute and interpret unconditional and
conditional probabilities
• Evaluate and interpret independence of
events
• Explain the key features of the binomial
distribution model
Learning Objectives (2 of 3)
• Calculate probabilities using the binomial
formula
• Explain the key features of the normal
distribution model
• Calculate probabilities using the standard
normal distribution table
• Compute and interpret percentiles of the
normal distribution
Learning Objectives (3 of 3)
• Define and interpret the standard error
• Explain sampling variability
• Apply and interpret the results of the
Central Limit Theorem
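Two Areas of Biostatistics
[Figure: a population and a sample drawn from it (n, X); descriptive statistics summarize the sample; the goal is statistical inference about the population.]
Sampling from a Population
[Figure: a population of size N from which many different samples, each of size n, can be drawn.]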
Sampling:
Population Size = N, Sample Size = n
(1 of 2)
• Simple random sample
– Enumerate all members of population N
(sampling frame), select n individuals at
random (each has same probability of being
selected).
• Systematic sample
– Start with sampling frame; determine
sampling interval (N/n); select first person at
random from the first (N/n) individuals, then
every (N/n)th individual thereafter.
Sampling:
Population Size = N, Sample Size = n
(2 of 2)
• Stratified sample
– Organize population into mutually exclusive
strata; select individuals at random within
each stratum.
• Convenience sample
– Non-probability sample (not for inference)
• Quota sample
– Select a predetermined number of
individuals into sample from groups of
interest.
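The three probability sampling schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 100 numbered individuals, the two strata, the sample sizes, and all function names are hypothetical, and the systematic version assumes N is a multiple of n.

```python
import random

def simple_random_sample(frame, n, rng):
    """Select n members so that each has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start in the first interval, then every k-th member."""
    k = len(frame) // n          # sampling interval N/n (assumes N is a multiple of n)
    start = rng.randrange(k)     # random position within the first k members
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)                 # seeded only so the sketch is repeatable
frame = list(range(1, 101))            # hypothetical sampling frame, N = 100
srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strata = {"low": frame[:50], "high": frame[50:]}   # hypothetical, mutually exclusive strata
strat = stratified_sample(strata, 5, rng)
```

Convenience and quota samples are left out on purpose: selection there does not depend on a random mechanism, so there is nothing comparable to simulate.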
Basics
• Probability reflects the likelihood that an
outcome will occur.
• 0 ≤ Probability ≤ 1
Probability = (Number with outcome) / N
Example 5.1.
Basic Probability (1 of 2)
P(Select any child) = 1/5290 = 0.0002
Example 5.1.
Basic Probability (2 of 2)
P(Select a boy) = 2560/5290 = 0.484
P(Select boy age 10) = 418/5290 = 0.079
P(Select child at least 8 years of age)
= (846 + 881 + 918)/5290
= 2645/5290 = 0.500
Conditional Probability
• Probability of outcome in a specific
subpopulation
• Example 5.1.
P(Select 9-year-old from among girls)
= P(Select 9-year-old | girl)
= 461/2730 = 0.169
P(Select boy | 6 years of age)
= 379/892 = 0.425
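The difference between an unconditional and a conditional probability is only the denominator: the whole population in the first case, a subpopulation in the second. A minimal sketch using the counts quoted in Example 5.1 (the function name is an assumption made for illustration):

```python
def conditional_probability(n_with_outcome_in_subgroup, n_in_subgroup):
    """P(outcome | subgroup): restrict the denominator to the subgroup."""
    return n_with_outcome_in_subgroup / n_in_subgroup

# Unconditional: boys among all 5290 children
p_boy = 2560 / 5290

# Conditional: 9-year-olds among the 2730 girls, boys among the 892 six-year-olds
p_9_given_girl = conditional_probability(461, 2730)
p_boy_given_6 = conditional_probability(379, 892)

print(round(p_boy, 3), round(p_9_given_girl, 3), round(p_boy_given_6, 3))
```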
Example 5.2.
Conditional Probability (1 of 2)
Example 5.2.
Conditional Probability (2 of 2)
P(Prostate cancer | Low PSA)
= 3/64 = 0.047
P(Prostate cancer | Moderate PSA)
= 13/41 = 0.317
P(Prostate cancer | High PSA)
= 12/15 = 0.80
Sensitivity and Specificity
Sensitivity = True positive fraction
= P(test+ | disease)
Specificity = True negative fraction
= P(test– | disease free)
False negative fraction = P(test– | disease)
False positive fraction = P(test+ | disease
free)
Example 5.4.
Sensitivity and Specificity
Sensitivity = P(test+ | disease) = 9/10 = 0.90
Specificity = P(test– | disease free)
= 4449/4800 = 0.927
False negative fraction = P(test– | disease)
= 1/10 = 0.10
False positive fraction = P(test+ | disease free)
= 351/4800 = 0.073
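The four fractions in Example 5.4 all come from the same four counts, which a short sketch can make explicit. The function and key names are assumptions for illustration; the counts (9 and 1 among the 10 with disease, 4449 and 351 among the 4800 disease free) are the ones used above.

```python
def screening_metrics(tp, fn, fp, tn):
    """Conditional probabilities of a test result given true disease status."""
    diseased = tp + fn          # everyone who truly has the disease
    disease_free = fp + tn      # everyone who is truly disease free
    return {
        "sensitivity": tp / diseased,                # P(test+ | disease)
        "false_negative_fraction": fn / diseased,    # P(test- | disease)
        "specificity": tn / disease_free,            # P(test- | disease free)
        "false_positive_fraction": fp / disease_free # P(test+ | disease free)
    }

metrics = screening_metrics(tp=9, fn=1, fp=351, tn=4449)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and the false negative fraction sum to 1, as do specificity and the false positive fraction, because each pair shares a denominator.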
Independence
• Two events, A and B, are independent if
P(A | B) = P(A) or if P(B | A) = P(B)
Example 5.2.
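• Is screening test independent of prostate
cancer diagnosis?
– P(Prostate cancer) = 28/120 = 0.233
– P(Prostate cancer | Low PSA) = 0.047
– P(Prostate cancer | Moderate PSA) = 0.317
– P(Prostate cancer | High PSA) = 0.80
• The conditional probabilities differ from
P(Prostate cancer), so the screening test and
prostate cancer diagnosis are not independent.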
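The independence check compares each conditional probability with the unconditional one. The sketch below uses the counts from Example 5.2 (3 of 64, 13 of 41, and 12 of 15); the variable names and the numerical tolerance are assumptions, and the comparison is purely descriptive rather than a formal statistical test.

```python
# (men with prostate cancer, men at that PSA level), from Example 5.2
psa = {"low": (3, 64), "moderate": (13, 41), "high": (12, 15)}

total_cases = sum(cases for cases, _ in psa.values())   # 28
total_men = sum(n for _, n in psa.values())             # 120
p_cancer = total_cases / total_men                      # unconditional probability

conditionals = {level: cases / n for level, (cases, n) in psa.items()}

# Independence requires P(cancer | PSA level) = P(cancer) at every level
independent = all(abs(p - p_cancer) < 1e-9 for p in conditionals.values())

print(round(p_cancer, 3), {k: round(v, 3) for k, v in conditionals.items()})
print("independent:", independent)
```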
Bayes’ Theorem (1 of 2)
• Using Bayes’ Theorem we revise or
update a probability based on additional
information.
– Prior probability is an initial probability.
– Posterior probability is a probability that is
revised or updated based on additional
information.
Bayes’ Theorem (2 of 2)
P(A | B) = P(B | A)P(A) / P(B)
P(A | B) = P(B | A)P(A) / [P(A)P(B | A) + P(A’)P(B | A’)]
Example (1 of 2)
• In Boston, 51% of adults are male.
• One adult is randomly selected to
participate in a study.
Prior probability of selecting a male = 0.51
Example (2 of 2)
• Selected participant is a smoker.
• 9.5% of males in Boston smoke as
compared to 1.7% of females.
• Find the probability that we selected a
male given he is a smoker.
Example: Find P(M | S)
• P(M) = 0.51, P(M’) = 0.49
P(S | M) = 0.095, P(S | M’) = 0.017
• Bayes’ Theorem
P(M | S) = P(S | M)P(M) / [P(M)P(S | M) + P(M’)P(S | M’)]
= 0.095(0.51) / [0.51(0.095) + 0.49(0.017)]
= 0.853
• Knowing that the participant smokes increases the
probability that the participant is male from 0.51
to 0.853.
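The Boston example can be reproduced with a small helper that implements the expanded form of Bayes’ Theorem. The function and argument names are assumptions chosen for readability; the inputs are the probabilities given in the example.

```python
def bayes_posterior(prior, p_evidence_given_a, p_evidence_given_not_a):
    """P(A | B) = P(B | A)P(A) / [P(A)P(B | A) + P(A')P(B | A')]."""
    numerator = p_evidence_given_a * prior
    denominator = numerator + p_evidence_given_not_a * (1 - prior)
    return numerator / denominator

# Prior P(M) = 0.51, P(S | M) = 0.095, P(S | M') = 0.017
p_male_given_smoker = bayes_posterior(0.51, 0.095, 0.017)
print(round(p_male_given_smoker, 3))
```

The denominator is the overall probability of smoking, built from the two groups, which is why the second form of the theorem needs no separate value for P(B).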
Example 5.8.
Bayes’ Theorem
P(disease) = 0.002
Sensitivity = 0.85 = P(test+ | disease)
P(test+) = 0.08 and P(test–) = 0.92
What is P(disease | test+)?
P(disease | test+) = P(test+ | disease)P(disease) / P(test+)
= 0.85(0.002) / 0.08 = 0.021
Binomial Distribution (1 of 2)
• Model for discrete outcome
• Process or experiment has two possible
outcomes: success and failure.
• Replications of process are independent.
• P(success) is constant for each
replication.
Binomial Distribution (2 of 2)
• Notation
n = number of times process is replicated
p = P(success)
x = number of successes of interest
0 ≤ x ≤ n
P(x successes) = [n! / (x!(n − x)!)] p^x (1 − p)^(n − x)
Example 5.9.
Binomial Distribution
• Medication for allergies is effective in reducing
symptoms in 80% of patients. If medication is
given to 10 patients, what is the probability it is
effective in 7?
P(7 successes) = [10! / (7!(10 − 7)!)] 0.8^7 (1 − 0.8)^(10 − 7)
= 120(0.2097)(0.008) = 0.2013
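The binomial formula translates directly into code. The sketch below is one possible implementation using `math.comb` from the Python standard library (Python 3.8 or later); the function name is an assumption, and the inputs are those of Example 5.9.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x successes) = n! / (x!(n - x)!) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 5.9: n = 10 patients, p = 0.8, x = 7 successes
p_seven = binomial_pmf(7, 10, 0.8)
print(comb(10, 7), round(p_seven, 4))
```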
Binomial Distribution (1 of 4)
• Antibiotic is claimed to be effective in 70%
of the patients. If antibiotic is given to five
patients, what is the probability it is
effective on exactly three?
Success = Antibiotic is effective: n = 5, p = 0.7, x = 3
P(X = 3) = [5! / (3!(5 − 3)!)] 0.7^3 (1 − 0.7)^(5 − 3)
= 10(0.343)(0.09) = 0.3087
Binomial Distribution (2 of 4)
• What is the probability that the antibiotic
is effective on all five?
P(X = 5) = [5! / (5!(5 − 5)!)] 0.7^5 (1 − 0.7)^(5 − 5)
= 1(0.1681)(1) = 0.1681
Binomial Distribution (3 of 4)
• What is the probability that the antibiotic is
effective on at least three?
P(X ≥ 3) = P(3) + P(4) + P(5)
= 0.3087 + 0.3601 + 0.1681 = 0.8369
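An "at least" probability is a sum of individual binomial probabilities, or equivalently one minus the sum of the excluded ones. A minimal sketch for the antibiotic example, with hypothetical function names:

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent replications."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob_at_least(k, n, p):
    """P(X >= k): add up P(k), P(k + 1), ..., P(n)."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

# Antibiotic example: n = 5, p = 0.7
p_at_least_3 = prob_at_least(3, 5, 0.7)
# Same quantity through the complement, 1 - [P(0) + P(1) + P(2)]
p_complement = 1 - sum(binomial_pmf(x, 5, 0.7) for x in range(3))
print(round(p_at_least_3, 4), round(p_complement, 4))
```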
Binomial Distribution (4 of 4)
• Mean and variance of the binomial
distribution
μ = np
σ^2 = np(1 − p)
For example, the mean (or expected)
number of patients in whom the antibiotic
is effective is 5*0.7 = 3.5
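The shortcut formulas for the mean and variance can be checked against the definitions, that is, by weighting each possible number of successes by its probability. This sketch uses the antibiotic example (n = 5, p = 0.7); the variable names are assumptions.

```python
from math import comb

n, p = 5, 0.7

# Probability of each possible number of successes, 0 through n
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# Mean and variance computed from their definitions
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

# Both should agree with the shortcut formulas np and np(1 - p)
print(round(mean, 4), round(variance, 4))
```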
Normal Distribution (1 of 3)
• Model for continuous outcome
• Mean = median = mode
Normal Distribution (2 of 3)
Notation: μ = mean and σ = standard
deviation
[Figure: normal curve with the horizontal axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ.]
Normal Distribution (3 of 3)
• Properties of normal distribution
i) The normal distribution is symmetric about the
mean (i.e., P(X > μ) = P(X < μ) = 0.5).
ii) The mean and variance, μ and σ^2, completely
characterize the normal distribution.
iii) The mean = the median = the mode.
P(μ − σ < X < μ + σ) ≈ 0.68
P(μ − 2σ < X < μ + 2σ) ≈ 0.95
P(μ − 3σ < X < μ + 3σ) ≈ 0.99
iv) P(a < X < b) = the area under the normal curve
from a to b.
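The areas within one, two, and three standard deviations of the mean can be computed instead of memorized. The sketch below relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name is an assumption. It reports the values to more decimal places than the rounded figures above, and the three-standard-deviation area comes out nearer 0.997.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma < X < mu + k*sigma) as an area under the normal curve."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within = {k: prob_within(k) for k in (1, 2, 3)}
for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The result does not depend on the particular mean and standard deviation, which is the sense in which these two parameters completely characterize the distribution.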
Example 5.11.
Normal Distribution (1 of 10)
• Body mass index (BMI) for men age 60 is
normally distributed with a mean of 29 and
standard deviation of 6.
• What is the probability that a male has BMI
less than 29?
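Because 29 is the mean of this distribution, symmetry alone gives the answer to this first question. A short sketch with `statistics.NormalDist` confirms it; the variable names are assumptions, and the same call works for any other cutoff.

```python
from statistics import NormalDist

# BMI for men age 60: mean 29, standard deviation 6
bmi = NormalDist(mu=29, sigma=6)

# P(X < 29) is the area under the curve to the left of the mean
p_below_mean = bmi.cdf(29)
print(p_below_mean)
```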
Example 5.11.
Normal Distribution (2 of 10)
Example 5.11.
Normal Distribution (3 of 10)
P(X