The definition of mean is "an average of n numbers computed by adding some function of the numbers and dividing by some function of n."
The central tendency of a set of measurement results is typically found by calculating the arithmetic mean ($\overline{X}$) and less commonly the median or geometric mean. The mean is an
estimate of the true value as long as there is no systematic error. In the absence of
systematic error, the mean approaches the true value (µ) as the number of
measurements (n) increases. The frequency distribution of the measurements approximates a bell-shaped curve that is symmetrical around the mean. The
arithmetic mean is calculated using the following equation:
$$\overline{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \tag{1}$$
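As a quick numerical illustration of equation (1), a few lines of Python suffice; the values below are the ones used later in Example 1, and the variable names are arbitrary.

```python
# Arithmetic mean of n replicate results, as in equation (1).
measurements = [1004, 1005, 1001, 981]      # replicate values from Example 1

x_bar = sum(measurements) / len(measurements)
print(f"mean = {x_bar:.2f}")                # mean = 997.75
```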
Typically, insufficient data are collected to determine if the data are evenly distributed. Most analysts rely upon quality control data obtained along with the sample data to indicate the accuracy of the procedural execution, i.e., the absence of systematic error(s). The analysis of at least one QC sample with the unknown
sample(s) is strongly recommended.
Even when the QC sample is in control, it is still important to inspect the data for outliers. There is a third type of error, typically referred to as a 'blunder': an unintentional mistake that falls in neither the systematic nor the random error category. It is a mistake that went unnoticed, such as a transcription error or a spilled solution.
For limited data sets (n = 3 to 10), the range ($X_n-X_1$), where $X_n$
is the largest value and $X_1$ is the smallest value, is a good estimate of
the precision and a useful value in data inspection. In the situation where a limited data set has a suspicious outlier and the QC sample is in control, the analyst
should calculate the range of the data and determine if it is significantly larger than would be expected based upon the QC data. If an explanation cannot be found
for an outlier (other than it appears too high or low), there is a convenient test that can be used for the rejection of possible outliers from limited data sets. This is
the Q test.
The Q test is commonly conducted at the 90% confidence level, but the following table (Table 1) includes the 96% and 99% levels as well for convenience. At the 90% confidence level, the analyst can reject a result with 90% confidence that the outlier is significantly different from the other results in the data set. The Q test involves dividing the difference between the outlier and its nearest value in the set by the range, which gives a quotient, Q. The range is always calculated by including the outlier, which is automatically the largest or smallest value in the data set. If the quotient is greater than the rejection quotient, $Q_{0.90}$, then the outlier can be rejected.
Table 1: The Q Test

| n | $Q_{0.90}$ | $Q_{0.96}$ | $Q_{0.99}$ |
|---|-----------|-----------|-----------|
| 3 | 0.94 | 0.98 | 0.99 |
| 4 | 0.76 | 0.85 | 0.93 |
| 5 | 0.64 | 0.73 | 0.82 |
| 6 | 0.56 | 0.64 | 0.74 |
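The rejection procedure described above translates directly into a short script. The following Python sketch is illustrative only: the function name, its return value, and the restriction to the $Q_{0.90}$ column of Table 1 for n = 3 to 6 are assumptions made here, not part of the original procedure.

```python
# Q test sketch using the Q_0.90 critical values from Table 1 (n = 3 to 6 only).
Q_CRIT_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56}

def q_test(values):
    """Return (Q, Q_critical, reject) for the most extreme value in the data set."""
    data = sorted(values)
    n = len(data)
    data_range = data[-1] - data[0]          # range always includes the suspected outlier
    # The suspect is whichever end value lies farther from its nearest neighbor.
    gap = max(data[1] - data[0], data[-1] - data[-2])
    q = gap / data_range
    q_crit = Q_CRIT_90[n]                    # raises KeyError outside n = 3 to 6
    return q, q_crit, q > q_crit
```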
Example 1: This example tests four results in a data set: 1004, 1005, 1001, and 981.
Solution:
- The range is calculated: 1005 - 981 = 24.
- The difference between the questionable result (981) and its nearest neighbor is calculated: 1001 - 981 = 20.
- The quotient is calculated: 20/24 = 0.83.
- The calculated quotient is compared to the $Q_{0.90}$ value of 0.76 for n = 4 and found to be greater.
- The questionable result (981) is rejected.
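Running the data from Example 1 through the hypothetical `q_test` sketch above reproduces the same decision:

```python
q, q_crit, reject = q_test([1004, 1005, 1001, 981])
print(f"Q = {q:.2f}, Q_0.90 = {q_crit}, reject outlier: {reject}")
# Q = 0.83, Q_0.90 = 0.76, reject outlier: True
```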