One-Sample Tests
Statistical tests with one sample: z-test
1. Introduction
In the one-sample t-test article, we introduced the procedure for the one-sample t-test, a hypothesis test for one population mean when the population standard deviation is not provided (which is the more common situation). However, in the rare event that we are given the population standard deviation, we can take advantage of this extra information and perform a (theoretically) more accurate hypothesis test: the one-sample z-test. Let’s motivate this using the same example from the one-sample t-test article. Here is the story again:
One day, while waiting for your bus to work, you notice a sign on the bus stop saying the average wait time for the arrival of a bus is 12 minutes. Being an experienced bus rider, you believe this claim is nonsense (you know, waiting for a bus to come is like waiting for the end of a work day), and you decide to dispute this claim. For the next 50 days, you (on purpose) miss a bus and record the time it takes for the next bus to arrive, and you get an average of 13.5 minutes. Now, is it enough to say the sign is wrong? In particular, is there enough evidence to claim that the waiting time should be longer?
2. One-Sample Z-test
In this article, we will explain the one-sample z-test. We use the z-test when we want to make inferences about a population mean and the population standard deviation is known. The setup for this test is very similar to that of the one-sample t-test. Below, let’s quickly list the different elements of a hypothesis test.
3. Elements of a Hypothesis Test
In general, a hypothesis test procedure consists of the following items:

Null ($H_0$) and alternative ($H_a$) hypotheses.

Assumptions to follow.

Test statistic.

Rejection rule.

Conclusion.
For more details on each item, readers can refer to our one-sample t-test article. Below is a description of some of the items, tailored to the one-sample z-test.
4. Null and Alternative Hypotheses
The null hypothesis $H_0$ assumes that the population mean equals the status quo. The alternative hypothesis $H_a$ varies depending on what we want to test:

$H_a$: the population mean is larger than the status quo.

$H_a$: the population mean is smaller than the status quo.

$H_a$: the population mean does not equal the status quo.
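Let’s introduce some notation and summarize the statements above in a table:

$\mu$ = population mean

$\mu_0$ = status quo

| Test type | $H_0$ | $H_a$ | Reject $H_0$ if |
|---|---|---|---|
| One-tailed, upper-tailed | $\mu = \mu_0$ (or $\mu \le \mu_0$) | $\mu > \mu_0$ | $z > z_\alpha$ |
| One-tailed, lower-tailed | $\mu = \mu_0$ (or $\mu \ge \mu_0$) | $\mu < \mu_0$ | $z < -z_\alpha$ (because of negative values) |
| Two-tailed | $\mu = \mu_0$ | $\mu \ne \mu_0$ | $\lvert z \rvert \ge z_{\alpha/2}$ |
Table 1: Hypotheses Summary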
5. Data Requirements
Just like the t-test, the z-test relies on some assumptions about the data. Here is a list of requirements our data should meet:

The variables should have values in a continuous range.

The sample size is large (by convention, larger than 30).

Sample values are taken independently.

Population variance is provided.

If the sample size is small, we can still use the z-test, provided that the population variance is known and the population distribution is Normal.
6. Test Statistic
The last column of Table 1 describes the scenarios in which we can reject the null hypothesis. The value $z$ is called the test statistic, while $z_\alpha$ (or $z_{\alpha/2}$ for a two-tailed test) is called the critical value. The test statistic can be calculated as follows:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
where
$\bar{x}$ = sample mean
$\mu_0$ = population mean (status quo)
$n$ = number of observations
$\sigma$ = population standard deviation
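As a minimal sketch, the test statistic z = (sample mean - status quo mean) / (population standard deviation / sqrt(n)) can be computed in a few lines of Python. The function name `z_statistic` and the numbers in the usage line are made up for illustration and are not taken from the article's R code.

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """One-sample z statistic: (sample_mean - mu0) / (sigma / sqrt(n))."""
    if sigma <= 0 or n <= 0:
        raise ValueError("sigma and n must be positive")
    standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
    return (sample_mean - mu0) / standard_error


# Illustrative (made-up) numbers: sample mean 52 from n = 36 observations,
# hypothesized mean 50, known population standard deviation 6.
print(z_statistic(52, 50, 6, 36))  # 2.0
```

Dividing by the standard error rather than by the standard deviation itself is what makes the statistic grow with the sample size for a fixed difference in means.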
7. Critical Value
The critical value is a cutoff value obtained from the assumed population distribution. For the one-sample z-test, the critical value can be obtained using the following rules:
| Test type | Critical value |
|---|---|
| Upper-tailed | $z_\alpha$ such that $P(Z > z_\alpha) = \alpha$ |
| Lower-tailed | $-z_\alpha$ such that $P(Z < -z_\alpha) = \alpha$ |
| Two-tailed | $z_{\alpha/2}$ such that $P(\lvert Z \rvert > z_{\alpha/2}) = \alpha$ |
Table 2: Critical Value.
where Z follows the standard Normal distribution, and α is the preset level of significance. In other words, the critical value is the value that leaves a probability of α (or α/2) in the tail of the standard Normal distribution. For example, if α is 0.05 and we are doing an upper-tailed test, then $z_\alpha$ will be 1.645. However, if we are doing a two-tailed test, the critical value $z_{\alpha/2}$ becomes 1.96. This value can be found using a z-table (see below), or any statistical computing software (see R code).
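The critical values quoted above can be reproduced with Python's standard library, shown here as a sketch rather than as the article's own code. `NormalDist().inv_cdf` is the inverse of the standard Normal cumulative distribution function. The helper name `critical_value` and its `tail` labels are assumptions made for this example.

```python
from statistics import NormalDist


def critical_value(alpha, tail):
    """Critical value of a one-sample z-test for the given tail type."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return std_normal.inv_cdf(1 - alpha)      # alpha in the right tail
    if tail == "lower":
        return std_normal.inv_cdf(alpha)          # alpha in the left tail
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


print(round(critical_value(0.05, "upper"), 3))  # 1.645
print(round(critical_value(0.05, "two"), 3))    # 1.96
```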
8. Full Procedure
We have just described all the details of a one-sample z-test. Let’s quickly summarize the entire procedure:

Determine α, the level of significance.

Define the null ($H_0$) and alternative ($H_a$) hypotheses.

Calculate the test statistic.

Calculate the critical value.

Compare the test statistic with the critical value.

Make a decision and conclusion based on the comparison.
9. Example
Let’s work out a full example. At the beginning of this post we revisited the story of a bus rider. Suppose that the bus company provides the standard deviation of bus arrival times, which turns out to be 2.6 minutes. Using this information (plus the data from the beginning), let’s carry out the hypothesis test. Set α to be 0.05. Let’s define some notation. Let $\mu$ be the population mean arrival time, and $\mu_0$ be the hypothesized arrival time, which is 12 minutes as claimed by the bus company. You want to show that the actual waiting time is longer, so you are doing an upper-tailed hypothesis test.
$H_0$: $\mu$ = 12
$H_a$: $\mu$ > 12
Test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{13.5 - 12}{2.6 / \sqrt{50}} \approx 4.08$$
With an upper-tailed test and α = 0.05, $z_\alpha$ is 1.645. Now looking at Table 1, since our test statistic is larger than the critical value, we can reject the null hypothesis. This says that we have enough evidence to conclude that the mean arrival time is indeed more than 12 minutes.
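The whole procedure for the bus example can be sketched in Python as follows. This is an illustration and not the article's R code: the function name `one_sample_z_test`, its parameters, and the `tail` labels are assumptions, and only the inputs (13.5, 12, 2.6, 50, α = 0.05) come from the example.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="upper"):
    """Return (test statistic, critical value, reject H0?) for a one-sample z-test."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if tail == "upper":
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "lower":
        critical = std_normal.inv_cdf(alpha)  # negative critical value
        reject = z < critical
    elif tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) >= critical
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, critical, reject


# Bus example: sample mean 13.5 minutes over 50 days, claimed mean 12 minutes,
# known population standard deviation 2.6 minutes, alpha = 0.05, upper-tailed.
z, critical, reject = one_sample_z_test(13.5, 12, 2.6, 50)
print(round(z, 2), round(critical, 3), reject)  # 4.08 1.645 True
```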
10. P-value
Another way to draw the conclusion of a hypothesis test is to use the p-value. In brief, the p-value is the probability of seeing a test statistic at least as extreme as the one we observed, assuming the null hypothesis is true. In other words, if the p-value is small, data like ours would rarely occur under the null hypothesis, yet we observed it, so we take this as evidence that the null hypothesis is incorrect. Hence, we reject the null hypothesis if the p-value is less than α. The following lists the ways to calculate the p-value for the z-test:
| Test type | P-value |
|---|---|
| Upper-tailed | $P(Z > z)$ |
| Lower-tailed | $P(Z < z)$ |
| Two-tailed | $2 \cdot P(Z > \lvert z \rvert)$ |
Table 3: P-value Calculation.
Again, Z follows the standard Normal distribution, and $z$ is the observed test statistic. Notice that in the calculation for the two-tailed test, we multiply the probability by 2. This is because in a two-tailed test we are not specifying which direction we are looking at, so we multiply the result by 2 to count both the left and right sides. Going back to the example above, the p-value is Pr(Z > 4.08) < 0.0001, which leads us to reject the null hypothesis as well.
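The p-value rules described above (right-tail probability, left-tail probability, and twice the tail probability for a two-tailed test) can be sketched with `statistics.NormalDist` from the Python standard library. The helper name `p_value` and its `tail` labels are assumptions made for this illustration.

```python
from statistics import NormalDist


def p_value(z, tail):
    """P-value of a one-sample z-test given the observed statistic z."""
    std_normal = NormalDist()
    if tail == "upper":
        return 1 - std_normal.cdf(z)             # P(Z > z)
    if tail == "lower":
        return std_normal.cdf(z)                 # P(Z < z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Bus example: upper-tailed test with z = 4.08.
p = p_value(4.08, "upper")
print(p < 0.0001)  # True
```

Because the p-value here is far below α = 0.05, the decision matches the one obtained from the critical value.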
11. Finding P-value and Critical Value using Table
A z-table has three components: 1) the image on top, 2) the numbers on the sides, and 3) the decimal numbers inside the table. The image at the top of the table tells us what the probabilities inside the table refer to. The numbers on the sides are the z-scores, and the numbers inside the table are the probabilities to the left of each corresponding z-score (for this particular table). For example, if the z-score is 1.45, then we get P(Z < 1.45) = 0.9265 by going down to the row with 1.4 on the left margin and across to the column with 0.05 on top. For our example, since we are doing a right-tailed test, we need to subtract the probability in this table from 1 to obtain the correct p-value.
Finding the critical value requires a ’reverse operation’. Instead of going from the outside margins towards the inside to obtain a probability, we go from the inside towards the outside to obtain the critical value. This works because we can think of α as a probability. For our right-tailed test, we set α = 0.05, hence we want $z_\alpha$ such that P(Z > $z_\alpha$) = 0.05. Since the table only gives the probability to the left of the z-score, we first subtract α from 1 (which gives 0.95) and find the closest probability to 0.95 inside the table. In many cases we will not be able to find the exact probability, and in our case the closest probabilities inside the table are 0.9495 and 0.9505. Looking towards the margins, we see that the two z-scores corresponding to these two probabilities are 1.64 and 1.65. Since 0.95 is between 0.9495 and 0.9505, we know that the critical value is also between 1.64 and 1.65. By convention, we simply take the average of these two values: $z_\alpha$ = 1.645.
Note that different z-tables may use different conventions: some tables give the probability from 0 to the z-score, i.e. P(0 < Z < z). Always refer to the image given.
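The table lookups described in this section can be cross-checked in software. The sketch below uses Python's `statistics.NormalDist`, where `cdf` plays the role of a left-tail z-table and `inv_cdf` performs the reverse lookup. The variable names are chosen for this illustration only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Forward lookup: probability to the left of z = 1.45 (row 1.4, column 0.05).
left_tail = std_normal.cdf(1.45)
print(round(left_tail, 4))  # 0.9265

# For a right-tailed test, subtract the left-tail probability from 1.
right_tail = 1 - left_tail
print(round(right_tail, 4))  # 0.0735

# Reverse lookup: z-score with probability 0.95 to its left.
z_alpha = std_normal.inv_cdf(0.95)
print(round(z_alpha, 3))  # 1.645

# A table that reports P(0 < Z < z) instead differs by exactly 0.5.
zero_to_z = left_tail - 0.5
print(round(zero_to_z, 4))  # 0.4265
```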
12. R Code
The R code for this article can be found on the GitHub repository.