The relationship between the z-test and the t-test:
- The z-test is a statistical test that uses the normal
distribution.
- It is a method to test whether the “sample mean” and the “population
mean” are statistically significantly different.
The key assumptions for using the z-test:
- The population mean and standard deviation are known (an assumption
that is rarely realistic).
→ Knowing the true population standard deviation (σ) is generally not
practical.
- The sample must be a simple random sample drawn from the
population.
- The population must follow a normal distribution.
- The z-test is only used when the population parameters are fully
known.
→ The assumptions for the z-test are often unrealistic.
→ Instead of the z-test, the Student’s t-test is used.
- Before learning the t-test, it is necessary to understand the
standard normal distribution, z-scores, and the z-distribution.
2. “Normal Distribution” and “Standard Normal Distribution”
2.1 Data Generation.
- Here, let’s create fake test results (score) for 10,000 imaginary
students.
- The population follows a normal distribution with a mean of 50 and a standard deviation of 10.
- Using the rnorm() function in R, we generate data from this population for analysis.
- To make the results reproducible, set a seed with set.seed() before generating the data, so that the same values are drawn on every run:
library(tibble)                # Provides tibble(); also loaded by library(tidyverse)
set.seed(1982)                 # Fix the random seed so the same data are generated on every run
score <- rnorm(10000,          # Create data for 10,000 imaginary students
               mean = 50,      # Population mean = 50
               sd = 10) |>     # Population standard deviation = 10
  round(digits = 0)            # Round each score to the nearest whole number
st_id <- 1:10000               # Create student IDs
df <- tibble(score, st_id)     # Combine the two variables into a data frame (df)
- Display descriptive statistics of df.
[1] 9.975603
- The standard deviation is approximately 10
Min. 1st Qu. Median Mean 3rd Qu. Max.
16 43 50 50 57 87
- The range of test scores is from 16 to 87 points
- The mean is 50 points.
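The commands that produced the output above are not shown. One plausible way to obtain it, assuming the statistics come from base R's sd() and summary() applied to the score column, is sketched below. The data frame is rebuilt with data.frame() (a substitution for tibble()) so that the block runs without extra packages.

```r
# Regenerate the simulated scores with the same settings as above
set.seed(1982)
score <- round(rnorm(10000, mean = 50, sd = 10), digits = 0)
df <- data.frame(score = score, st_id = 1:10000)

# Standard deviation of the scores (uses the n - 1 denominator)
score_sd <- sd(df$score)
print(score_sd)

# Minimum, quartiles, median, mean, and maximum
score_summary <- summary(df$score)
print(score_summary)
```

Because the scores are random draws, the sample mean and standard deviation land close to, but not exactly on, the population values of 50 and 10.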
2.2 Standardization
- Standardizing the Test Scores (score)
- “Standardizing the test scores” means converting each test score to a z-score.
Standard Normal Distribution
- The standard normal distribution is the distribution of z-scores,
after applying a transformation to the normal distribution
(standardization).
- The features of the standard normal distribution are:
・The mean is 0.
・The standard deviation is 1.
・Nearly all values (about 99.99%) fall between -4 and 4.
- Converting the test scores to z-scores changes these quantities as follows:
・The mean of the scores is 50 => the mean of the z-scores is 0.
・The standard deviation of the scores is 10 => the standard deviation of the z-scores is 1.
・The scores range from 16 to 87 points => the z-scores range from about -3.4 to 3.7, that is, within the interval from -4 to 4.
Formula for Standardization \[z = \frac{\text{individual data} - \text{mean}}{\text{standard deviation}}\]
- For example, let’s convert a test score of 30 into a z-score:
[1] -2
- A test score of 30 corresponds to a z-score of -2.
- The z-score corresponding to the average test score of 50 is 0.
- After standardization, the test scores follow a standard normal distribution.
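The conversion above can be sketched in R as follows. The helper name to_z() is hypothetical (it is not part of the original notes), and the population mean of 50 and standard deviation of 10 are taken from the data-generation step.

```r
# Population parameters used to simulate the scores
pop_mean <- 50
pop_sd <- 10

# Hypothetical helper: convert a raw score to a z-score
to_z <- function(x, mean, sd) (x - mean) / sd

z_30 <- to_z(30, pop_mean, pop_sd)  # a test score of 30
z_50 <- to_z(50, pop_mean, pop_sd)  # the average test score
print(z_30)
print(z_50)

# The same function standardizes a whole vector at once
scores <- c(16, 30, 50, 87)
z_scores <- to_z(scores, pop_mean, pop_sd)
print(z_scores)
```

A score of 30 gives -2 and a score of 50 gives 0, matching the values in the text. The smallest and largest simulated scores (16 and 87) map to -3.4 and 3.7.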
2.3 How to Read the Standard Normal Distribution Table
2.3.1 When z = 1 (1\(\sigma\))
- For “z = 1 standard deviation”, the value at the intersection of “z = 1.00” and “0.00” is “0.3413”.
- This number means that the area between z-values of 0 and 1.0 is 0.3413.
- The total area under the bell curve is 1
- Out of that, 0.3413 is the area between z = 0 and z = 1
- Since probabilities can be represented by areas, 34.13% of the total
100% probability lies between z = 0 and z = 1.
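The table lookup described above can be reproduced without a printed table. This is a minimal sketch using base R's pnorm(), the cumulative distribution function of the normal distribution. The helper name table_value() is hypothetical.

```r
# Hypothetical helper: area under the standard normal curve between 0 and z,
# which is the quantity the table entry reports
table_value <- function(z) pnorm(z) - pnorm(0)

area_0_to_1 <- table_value(1)
print(round(area_0_to_1, 4))  # 0.3413
```

Since pnorm(0) is exactly 0.5, the table entry is simply the cumulative probability up to z minus one half.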
Standard Normal Distribution and Standard Deviation: ±1\(\sigma\)
・When the random variable \(X\) follows \(N(\mu, \sigma^2)\)
→ meaning “\(X\) follows a normal distribution with a mean of \(\mu\) and a standard deviation of \(\sigma\)”
- The probability that \(X\) falls between the mean \(\mu\) and \(\mu + 1\sigma\) is 34.13%
→ The probability that \(X\) falls within the range from \(-1\sigma\) to \(1\sigma\) (±1 standard deviation) from the mean \(\mu\) is 34.13 × 2 = 68.26%
2.3.2 When z = 2 (2\(\sigma\))
- For “z = 2 standard deviations”, the value at the intersection of “z = 2.00” and “0.00” is “0.4772”.
- This number means that the area between z-values of 0 and 2.0 is 0.4772.
- The total area under the bell curve is 1.
- Out of that, 0.4772 is the area between z = 0 and z = 2.
- Since probabilities can be represented by areas, 47.72% of the total 100% probability lies between z = 0 and z = 2.
Standard Normal Distribution and Standard Deviation: ±2\(\sigma\)
・When the random variable \(X\) follows \(N(\mu, \sigma^2)\)
→ meaning “\(X\) follows a normal distribution with a mean of \(\mu\) and a standard deviation of \(\sigma\)”
- The probability that \(X\) falls between the mean \(\mu\) and \(\mu + 2\sigma\) is 47.72%
→ The probability that \(X\) falls within the range from \(-2\sigma\) to \(2\sigma\) (±2 standard deviations) from the mean \(\mu\) is 47.72 × 2 = 95.44%
2.3.3 When z = 3 (3\(\sigma\))
- For “z = 3 standard deviations”, the value at the intersection of “z = 3.00” and “0.00” is “0.4987”.
- This number means that the area between z-values of 0 and 3.0 is 0.4987.
- The total area under the bell curve is 1.
- Out of that, 0.4987 is the area between z = 0 and z = 3.
- Since probabilities can be represented by areas, 49.87% of the total 100% probability lies between z = 0 and z = 3.
Standard Normal Distribution and Standard Deviation: ±3\(\sigma\)
・When the random variable \(X\) follows \(N(\mu, \sigma^2)\)
→ meaning “\(X\) follows a normal distribution with a mean of \(\mu\) and a standard deviation of \(\sigma\)”
- The probability that \(X\) falls between the mean \(\mu\) and \(\mu + 3\sigma\) is 49.87%
→ The probability that \(X\) falls within the range from \(-3\sigma\) to \(3\sigma\) (±3 standard deviations) from the mean \(\mu\) is 49.87 × 2 = 99.74%
2.3.4 Summary
Standard Normal Distribution & Standard Deviation (\(\sigma\))
・When the random variable \(X\) follows \(N(\mu, \sigma^2)\)
→ meaning “\(X\) follows a normal distribution with a mean of \(\mu\) and a standard deviation of \(\sigma\)”
- The probability that \(X\) falls within the range from \(-1\sigma\) to \(1\sigma\) (±1 standard deviation) from the mean \(\mu\) is 34.13 × 2 = 68.26%
- The probability that \(X\) falls within the range from \(-2\sigma\) to \(2\sigma\) (±2 standard deviations) from the mean \(\mu\) is 47.72 × 2 = 95.44%
- The probability that \(X\) falls within the range from \(-3\sigma\) to \(3\sigma\) (±3 standard deviations) from the mean \(\mu\) is 49.87 × 2 = 99.74%
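The three probabilities in this summary hold for any normal distribution, not only the standard one. As a rough check, the sketch below evaluates them with pnorm() for the test-score population used earlier (mean 50, standard deviation 10). The variable names are illustrative.

```r
# Parameters of the simulated test-score population
mu <- 50
sigma <- 10

# P(mu - k*sigma <= X <= mu + k*sigma) for k = 1, 2, 3
k <- 1:3
prob_within <- pnorm(mu + k * sigma, mean = mu, sd = sigma) -
  pnorm(mu - k * sigma, mean = mu, sd = sigma)

result <- data.frame(k = k,
                     lower = mu - k * sigma,
                     upper = mu + k * sigma,
                     percent = round(100 * prob_within, 2))
print(result)
```

The exact values are 68.27%, 95.45%, and 99.73%. They differ from the doubled table entries (68.26%, 95.44%, 99.74%) in the last digit only because the table rounds each one-sided area to four decimal places before doubling.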
2.4 偏差値 (Hensachi: deviation value)
- The “deviation value” (hensachi) used in the Japanese entrance exam system is a modified version of the z-distribution, with the mean and standard deviation adjusted as follows.
| | Mean | Standard Deviation |
|---|---|---|
| z-distribution | 0 | 1 |
| ↓ | ↓ | ↓ |
| hensachi | 50 | 10 |
Formula for Standardization \[z = \frac{\text{individual data} - \text{mean}}{\text{standard deviation}}\]
Formula for Calculating Hensachi
\[\text{Hensachi} = 10 \times \frac{\text{individual data} - \text{mean}}{\text{standard deviation}} + 50\]
If the deviation value (hensachi) is 60:
- 68.26% of the total falls within “the mean (50 points) ± 1 standard deviation,” that is, between 40 and 60.
→ This means you are in the top (100 - 68.26) / 2 = 31.74 / 2 = approximately 15.87%.
If the deviation value is 70:
- 95.44% of the total falls within “the mean (50 points) ± 2 standard deviations,” that is, between 30 and 70.
→ This means you are in the top (100 - 95.44) / 2 = 4.56 / 2 = approximately 2.28%.
If the deviation value is 80:
- 99.74% of the total falls within “the mean (50 points) ± 3 standard deviations,” that is, between 20 and 80.
→ This means you are in the top (100 - 99.74) / 2 = 0.26 / 2 = approximately 0.13%.
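The hensachi formula and the top-percentage reasoning above can be sketched in R. The helper name to_hensachi() and the example test (raw score 72, mean 60, standard deviation 8) are made up for illustration. The percentages assume that the scores are normally distributed.

```r
# Hypothetical helper: convert a raw score to hensachi
to_hensachi <- function(x, mean, sd) 10 * (x - mean) / sd + 50

# Made-up example: raw score 72 on a test with mean 60 and standard deviation 8
h <- to_hensachi(72, mean = 60, sd = 8)
print(h)

# Share of test takers above hensachi 60, 70, and 80 (upper-tail area)
hensachi <- c(60, 70, 80)
top_percent <- 100 * pnorm((hensachi - 50) / 10, lower.tail = FALSE)
print(round(top_percent, 2))
```

The made-up score corresponds to a hensachi of 65, and the upper-tail shares come out as 15.87%, 2.28%, and 0.13%, in line with the calculation in the text.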
3. t-test
- According to the Central Limit Theorem introduced in the previous session, even if the population is not normally distributed, the distribution of the sample mean in random samples from a population with a mean \(\mu\) and variance \(\sigma^2\) will approximately follow a normal distribution, provided the sample size is sufficiently large.
- However, it’s not always the case that we have a large enough sample
size, and often, we can only obtain small sample sizes.
- This issue was resolved by William Sealy Gosset, a former Guinness Brewery employee, who devised the t-test.
- He published his work under the name Student, which is why it’s called the Student’s t-test.
- Gosset discovered that even with small sample sizes, when the population is normally distributed, the sample mean standardized by its estimated standard error follows a t-distribution with (n−1) degrees of freedom.
- Thanks to this discovery, statistical estimation became possible using the characteristics of the t-distribution, even with small sample sizes.
- In the past, the z-test, which uses the characteristics of the standard normal distribution, was used when the sample size exceeded 100, while the t-test was employed only when the sample size was small.
- The reason was that finding critical values for the t-test required constructing large t-distribution tables manually, which was a complex task.
- However, with the development of statistical software like R and Stata, obtaining critical values and p-values from the t-distribution became much easier, and as a result, the z-test became largely obsolete.
- As shown in the t-distribution chart below, as the degrees of freedom (the sample size minus 1) increase, the results of the t-test and z-test converge, and as the sample size approaches infinity, the t-distribution becomes identical to the standard normal distribution.
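The convergence described above can be checked numerically. The sketch below compares the two-tailed 5% critical value of the t-distribution, qt(0.975, df), with that of the standard normal distribution, qnorm(0.975). The set of degrees of freedom is an arbitrary choice for illustration.

```r
# Two-tailed 5% critical values of the t-distribution for growing degrees of freedom
dfs <- c(2, 5, 9, 30, 100, 1000)
t_crit <- qt(0.975, df = dfs)

# Corresponding critical value of the standard normal distribution
z_crit <- qnorm(0.975)

comparison <- data.frame(df = dfs,
                         t_critical = round(t_crit, 3),
                         z_critical = round(z_crit, 3),
                         difference = round(t_crit - z_crit, 3))
print(comparison)
```

The t critical value is always larger than the normal one, which reflects the extra uncertainty from estimating the standard deviation, and the gap shrinks steadily as the degrees of freedom grow.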
3.1 One-Sample t-test and Significance Level
Procedure for t-test
- Suppose we extract a sample of 10 data points, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, from a normally distributed population with a population mean
of μ = 5.5.
- Now, we want to check whether the population mean is 5, even though
the sample mean obtained from the sample size of 10 is 5.5.
- To verify this, we need to calculate the t-value.
6 Steps for the t-test:
- Clearly state the null hypothesis (\(H_0\)).
- Clearly state the alternative hypothesis (\(H_1\)).
- Calculate the t-value.
- Identify the critical values to reject the null hypothesis.
- Check whether the t-value falls within the rejection region.
- Draw a conclusion.
Specific Testing Process:
- First, we set the null hypothesis.
- The null hypothesis (\(H_0\)) is
the hypothesis that is set to be rejected.
- What we want to know is: “The sample mean is 5.5, but is the
population mean 5?”
- The null hypothesis (\(H_0\)) is
“Population mean = 5.”
- Next, we set the alternative hypothesis (\(H_1\)).
- The alternative hypothesis is “The population mean is not 5.”
- The null hypothesis (\(H_0\)) and the alternative hypothesis (\(H_1\)) are mutually exclusive.
- Next, we calculate the t-value.
- The t-value can be calculated using the following formula:
\[T = \frac{\bar{x} - \mu_0}{SE} = \frac{\bar{x} - \mu_0}{u_x / \sqrt{n}}\]
\(\bar{x}\): sample mean (in this case, 5.5)
\(\mu_0\): the hypothesized value of the population mean (in this case, 5)
\(n\): sample size (in this case, 10)
\(u_x\): unbiased standard deviation (= sample standard deviation)
\(SE\): standard error (\(u_x / \sqrt{n}\))
The unbiased standard deviation \(u_x\) is the square root of the unbiased variance, so first, we calculate the unbiased variance.
The unbiased variance of x (\(u_x^2\)) can be calculated using the following formula:
\[u_x^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1}\]
- From this, the unbiased variance is \(u_x^2\) = 82.5 / 9 = 9.17, so the unbiased standard deviation of x is \(u_x\) = \(\sqrt{9.17}\) = 3.03.
- Now, we want to check whether the population mean is 5.
→ By substituting the hypothesized population mean \(\mu_0\) = 5, we obtain the following t-value:
\[T = \frac{\bar{x} - \mu_0}{u_x / \sqrt{n}}\]
\[ = \frac{5.5 - 5}{3.03 / \sqrt{10}}\]
\[ = 0.522\]
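The hand calculation above can be traced step by step in R. This is a minimal sketch using only base functions. The variable names are illustrative.

```r
# The sample of size 10 from the example
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
mu0 <- 5                             # hypothesized population mean under H0

n <- length(x)
x_bar <- mean(x)                     # sample mean
u2 <- sum((x - x_bar)^2) / (n - 1)   # unbiased variance (same as var(x))
u <- sqrt(u2)                        # unbiased standard deviation (same as sd(x))
se <- u / sqrt(n)                    # standard error of the mean
t_value <- (x_bar - mu0) / se

print(c(mean = x_bar, variance = u2, sd = u, se = se, t = t_value))
```

The unbiased variance is 82.5 / 9, about 9.17, its square root is about 3.03, and the resulting t-value is about 0.522, as in the text.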
Point Estimation
- For interval estimation, refer to “Estimation of Population Mean and
Confidence Interval”
- Using the obtained t-value of 0.522, we will test whether the hypothesis that the population mean is 5 is tenable, based on the sample of size 10 (1, 2, 3, 4, 5, 6, 7, 8, 9, 10).
- The values in the t-distribution table indicate the critical values for rejection, corresponding to the significance level used in the test.
- If the absolute value of the t-value obtained from the sample is greater than the critical value, the null hypothesis is rejected (for a two-tailed test).
- The horizontal axis of the table shows the significance level for a
one-tailed test.
- For example, the second column “Probability 95%” indicates that the
significance level for a two-tailed test is 5% (α = 0.05).
- In t-tests, it is common to use a significance level of 5% (α = 0.05) for two-tailed tests.
- Therefore, we use the “Probability 95%” from the second column of the t-distribution table.
- The first and the fourth columns of the table show the degrees of freedom (df).
- Here, the degrees of freedom are “9,” which is the sample size of 10 minus 1.
- The number at the intersection of the row for 9 degrees of freedom
and the column for “Probability 95%” is 2.262, which is the critical value
for a two-tailed test at the 5% significance level.
- Graphically, this can be illustrated as follows:
- The following figure shows the two critical values, -2.26 and 2.26,
for a t-distribution with 9 degrees of freedom at a 5% significance
level (α = 0.05).
- If the t-value obtained from the sample falls within the range of -2.26 to 2.26, the null hypothesis \(H_0\), ‘population mean = 5,’ cannot be rejected.
- If the t-value falls outside this range (the red hatched area), the null hypothesis will be rejected.
- Since the t-value is 0.522, the null hypothesis \(H_0\) cannot be rejected.
- Therefore, based on the sample obtained, it cannot be ruled out that
the population mean is 5.
- In other words, based on the sample obtained, the population mean could be 5.
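Instead of reading the critical value from a printed table, it can be obtained with qt(), the quantile function of the t-distribution. The sketch below reproduces the decision for this example. The variable names are illustrative.

```r
alpha <- 0.05      # two-tailed significance level
df <- 9            # degrees of freedom = sample size 10 - 1
t_value <- 0.522   # t-value computed in the text

# Upper critical value; the lower one is its negative by symmetry
t_crit <- qt(1 - alpha / 2, df = df)
print(round(t_crit, 3))   # 2.262

# Reject H0 only if the t-value lies outside [-t_crit, t_crit]
reject_h0 <- abs(t_value) > t_crit
print(reject_h0)          # FALSE
```

Because 0.522 lies well inside the range from -2.262 to 2.262, the null hypothesis is not rejected.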
3.2 Significance Level and P-value
Significance Level
- A “significance level of 5%” means that if the null hypothesis \(H_0\) is true, there is a 5% risk (5 times
out of 100) of mistakenly rejecting the null hypothesis by chance.
- Therefore, the significance level is also referred to as the “risk
level” (\(\alpha\): alpha).
- This error is called a “Type I error.”
- If the t-value obtained from the sample is so extreme that it would occur less than 5% of the time when \(H_0\) is true, the null hypothesis \(H_0\) is rejected (i.e., the hypothesis is nullified), and the alternative hypothesis is accepted.
- It is important to note that some statistics books or websites describe the significance level as equivalent to the p-value, so caution is needed.
Significance probability = p-value
- The p-value represents “the probability, when the null hypothesis is true, that the test statistic takes a value at least as extreme as the t-value obtained from the sample.”
・The p-value indicates the extremeness of the t-value under the null hypothesis: the smaller the p-value, the more extreme the t-value obtained from the sample.
- Therefore, if the t-value is sufficiently extreme (i.e., if the p-value is sufficiently small), the null hypothesis is rejected.
- The p-value is neither “the probability that the null hypothesis is true” nor “the probability that the alternative hypothesis is wrong.”
- The “probability that the null hypothesis is true” is either 0 or 1,
and it cannot take a value between 0 and 1 like the p-value.
- The truth is that either the null hypothesis is correct, or it is
incorrect.
- If the null hypothesis is true, then “the probability that the null
hypothesis is true” is 1, and if not, it is 0.
- Since we do not know the truth (i.e., the population parameter), it
is impossible to know whether this probability is 0 or 1.
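To make the definition of the p-value concrete, the sketch below computes the two-sided p-value for the running example directly from the t-distribution with pt(). It assumes the same sample (1 to 10) and null value (5) as in Section 3.1.

```r
# Sample and null value from the running example
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
mu0 <- 5

t_value <- (mean(x) - mu0) / (sd(x) / sqrt(length(x)))
df <- length(x) - 1

# Two-sided p-value: probability of |T| >= |t_value| when H0 is true
p_value <- 2 * pt(abs(t_value), df = df, lower.tail = FALSE)
print(p_value)

# The null hypothesis is rejected only if the p-value is below alpha
alpha <- 0.05
print(p_value < alpha)
```

The p-value here is far above 0.05, so a t-value of this size is not at all unusual when the population mean really is 5.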
3.3 T-test using R
# Name the sample of size 10 as score
score <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Test the population mean = 5 (Confidence Interval = 95% → 5%
significance level)
# Perform a t-test for the null hypothesis (population mean = 5)
t.test(score, mu = 5)
One Sample t-test
data: score
t = 0.52223, df = 9, p-value = 0.6141
alternative hypothesis: true mean is not equal to 5
95 percent confidence interval:
3.334149 7.665851
sample estimates:
mean of x
5.5
- In line 3, the result shows a t-value = 0.52223 and a p-value = 0.6141.
- The p-value of 0.6141 represents the p-value for a two-sided test.
- In line 4, the alternative hypothesis is explicitly stated as: alternative hypothesis: true mean is not equal to 5.
- Since the p-value (0.6141) is greater than 0.05, the null hypothesis (“population mean = 5”) cannot be rejected.
- If the p-value were smaller than 0.05, the null hypothesis could be rejected at the 5% significance level (\(\alpha\) = 0.05).
- Therefore, based on the sample obtained, it cannot be ruled out that
the population mean is 5.
- In other words, based on the sample obtained, the population
mean could be 5.
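The numbers in the printed output can also be pulled out of the object that t.test() returns, which is convenient when the decision needs to be made in code. This is a sketch. The component names statistic, parameter, p.value, and conf.int belong to the standard htest object, while the variable names are illustrative.

```r
score <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
result <- t.test(score, mu = 5)

t_value <- unname(result$statistic)  # t statistic
df <- unname(result$parameter)       # degrees of freedom
p_value <- result$p.value            # two-sided p-value
ci <- result$conf.int                # 95% confidence interval for the mean

print(c(t = t_value, df = df, p = p_value))
print(ci)

# Decision at the 5% significance level
reject_h0 <- p_value < 0.05
print(reject_h0)
```

The hypothesized value 5 lies inside the 95% confidence interval, which is the same conclusion as the p-value being greater than 0.05.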
Testing Population Mean = 3
(Confidence Interval = 95% → Statistically Significant at 5%)
# Perform a t-test for the null hypothesis (population mean = 3)
t.test(score, mu = 3)
One Sample t-test
data: score
t = 2.6112, df = 9, p-value = 0.02822
alternative hypothesis: true mean is not equal to 3
95 percent confidence interval:
3.334149 7.665851
sample estimates:
mean of x
5.5
- In line 3, the result shows a t-value = 2.6112 and a p-value =
0.02822.
- The p-value of 0.02822 represents the p-value for a two-sided test.
- In line 4, the alternative hypothesis is explicitly stated as:
alternative hypothesis: true mean is not equal to 3.
- Since the p-value (0.02822) is smaller than 0.05, the null hypothesis (“population mean = 3”) can be rejected.
- Therefore, based on the sample obtained, we conclude at the 5% significance level that the population mean is not equal to 3.
- In other words, based on the sample obtained, the population mean is not 3 and, because the sample mean (5.5) exceeds 3, is likely to be greater than 3.
4. Estimation of Proportion
Cabinet Approval 29%, Disapproval 52% (NHK Public Opinion Poll)
- According to the NHK public opinion poll, 29% of respondents said
they “support” the Suga Cabinet, down 4 points from last month, marking
the lowest level since the Cabinet was formed in September 2020.
- The survey targeted 2,115 people, and responses were received from
57%, or 1,214 people. (Updated on August 10, 2021)
Q: Can we conclude from this poll that the approval rating of the
Suga Cabinet is below 30%?
- Let’s perform an estimation of proportions.
- Those who answered “support” were 29% of the 1,214 respondents,
which is 352 people.
prop.test(c(352), c(1214))
1-sample proportions test with continuity correction
data: c(352) out of c(1214), null probability 0.5
X-squared = 213.41, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.2647212 0.3165265
sample estimates:
p
0.2899506
- The 95% confidence interval for the Cabinet approval rating is 26.47% ~ 31.65%.
- Because this interval includes values of 30% and above, even though this survey produced an estimated approval rating of 29%, we cannot say with 95% confidence that “the approval rating of the Cabinet among the entire electorate is below 30%.”
- Suppose, instead of 352 people, 300 people had answered that they
“support” the Suga Cabinet.
prop.test(c(300), c(1214))
1-sample proportions test with continuity correction
data: c(300) out of c(1214), null probability 0.5
X-squared = 309.53, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.2232793 0.2725771
sample estimates:
p
0.247117
- The 95% confidence interval for the Cabinet approval rating is 22.33% ~ 27.26%.
- Because the upper limit of this interval is below 30%, with this result we can say with 95% confidence that “the approval rating of the Cabinet among the entire electorate is below 30%.”
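The comparison of the two scenarios can be automated by extracting the confidence interval from the object returned by prop.test(). This is a sketch. The helper name below_threshold() is hypothetical, and the rule it applies (the whole 95% interval must lie below 30%) is the one used in the text.

```r
# Hypothetical helper: TRUE if the whole 95% confidence interval lies below the threshold
below_threshold <- function(supporters, respondents, threshold = 0.30) {
  ci <- prop.test(supporters, respondents)$conf.int
  ci[2] < threshold
}

actual <- below_threshold(352, 1214)        # the real poll result
hypothetical <- below_threshold(300, 1214)  # the "what if" scenario

print(c(actual = actual, hypothetical = hypothetical))
```

With 352 supporters the upper limit exceeds 30%, so the function returns FALSE. With 300 supporters the upper limit is below 30%, so it returns TRUE.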
5. Exercises
Q5.1: Referencing the z-distribution, solve the following problem.
- All first-year high school students in Prefecture B took a math test
at a certain preparatory school.
- After grading, it was found that “the results of the math test”
follow a normal distribution with a mean of 45 points and a standard
deviation of 10.
- Answer the question: What is the proportion (i.e., probability) of
students who scored 63 points or higher?
Q5.2: Referencing the Z-distribution, solve the following
problems.
- All first-year high school students took a math test at a certain
preparatory school. After grading, it was found that “the results of the
math test” follow a normal distribution with a mean of 45 points and a
standard deviation of 10.
Q5.2.1: What is the proportion of students who
scored 70 points or higher?
Q5.2.2: What is the proportion of students who scored
50 points or higher?
Q5.2.3: What is the proportion of students who scored
40 points or lower?
Q5.2.4: What score is required to be in the top 5%?
Q5.3: Referencing the t-test, solve the following problem.
- A student from Waseda University, Mr. A, opened a café called
“Rouault” in the Okuma shopping street.
- Mr. A assumed that the sales of “Rouault” follow a normal
distribution and aimed to estimate the population mean \(\mu\) as a representative measure of
sales.
- He randomly selected 8 receipts from the café’s total daily sales,
and the following numbers were obtained.
\[45, 39, 42, 57, 28, 33, 40, 52 \quad (\text{unit: 10,000 yen})\]
- Test the hypothesis that the population mean of daily sales for “Rouault” is 500,000 yen.
Q5.4: Referencing the t-test, solve the following problem.
- Assume that 10 data points (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) are
sampled from a normal population with a population mean \(\mu\) = 5.5
- Here, the sample mean obtained from the sample size of 10 is 5.5,
but is the population mean actually 3?
- Use R to verify this.
Q5.5: Referencing the z-distribution, solve the following problems.
- 20,000 students took the entrance exam for the School of Political
Science and Economics at Waseda University, and their scores followed a
normal distribution with a mean of 65 points and a standard deviation of
10 points.
Q5.5.1: What is the probability that a student’s
score is between 75 and 85 points?
Q5.5.2: In this entrance exam, what score is required
to be in the top 10%?
Q5.5.3: In this entrance exam, the top 1,000 students
are admitted. What score is required to be admitted?