14.3: Validating the Method as a Standard Method


    For an analytical method to be useful, an analyst must be able to achieve results of acceptable accuracy and precision. Verifying a method, as described in the previous section, establishes this goal for a single analyst. Another requirement for a useful analytical method is that an analyst should obtain the same result from day to day, and different labs should obtain the same result when analyzing the same sample. The process by which we approve a method for general use is known as validation, and it involves a collaborative test of the method by analysts in several laboratories. Collaborative testing is used routinely by regulatory agencies and professional organizations, such as the U.S. Environmental Protection Agency, the American Society for Testing and Materials, the Association of Official Analytical Chemists, and the American Public Health Association. Many of the representative methods in earlier chapters are identified by these agencies as validated methods.

    Representative Method 10.1 for the determination of iron in water and wastewater, and Representative Method 10.5 for the determination of sulfate in water, are two examples of standard methods validated through collaborative testing.

    When an analyst performs a single analysis on a sample, the difference between the experimentally determined value and the expected value is influenced by three sources of error: random errors, systematic errors inherent to the method, and systematic errors unique to the analyst. If the analyst performs enough replicate analyses, then we can plot a distribution of results, as shown in Figure 14.18a. The width of this distribution is described by a standard deviation, providing an estimate of the random errors affecting the analysis. The position of the distribution’s mean, \(\overline{X}\), relative to the sample’s true value, \(\mu\), is determined both by systematic errors inherent to the method and those systematic errors unique to the analyst. For a single analyst there is no way to separate the total systematic error into its component parts.

    The goal of a collaborative test is to determine the magnitude of all three sources of error. If several analysts each analyze the same sample one time, the variation in their collective results, as shown in Figure 14.18b, includes contributions from random errors and those systematic errors (biases) unique to the analysts. Without additional information, we cannot separate the standard deviation for this pooled data into the precision of the analysis and the systematic errors introduced by the analysts. We can, however, use the position of the distribution relative to the sample’s true value to detect the presence of a systematic error in the method.


    Figure 14.18 Partitioning of random errors, systematic errors due to the analyst, and systematic errors due to the method for (a) replicate analyses performed by a single analyst, and (b) single determinations performed by several analysts.

    14.3.1 Two-Sample Collaborative Testing

    The design of a collaborative test must provide the additional information we need to separate random errors from the systematic errors introduced by the analysts. One simple approach—accepted by the Association of Official Analytical Chemists—is to have each analyst analyze two samples that are similar in both their matrix and in their concentration of analyte. To analyze their results we represent each analyst as a single point on a two-sample chart, using the result for one sample as the x-coordinate and the result for the other sample as the y-coordinate.8

    As shown in Figure 14.19, a two-sample chart divides the results into four quadrants, which we identify as (+, +), (–, +), (–, –), and (+, –), where a plus sign indicates that the analyst’s result exceeds the mean for all analysts and a minus sign indicates that the analyst’s result is smaller than the mean for all analysts. The quadrant (+, –), for example, contains results for analysts that exceeded the mean for sample X and fell below the mean for sample Y. If the variation in results is dominated by random errors, then we expect the points to be distributed randomly in all four quadrants, with an equal number of points in each quadrant. Furthermore, as shown in Figure 14.19a, the points will cluster in a circular pattern whose center is the mean values for the two samples. When systematic errors are significantly larger than random errors, then the points occur primarily in the (+, +) and the (–, –) quadrants, forming an elliptical pattern around a line bisecting these quadrants at a 45° angle, as seen in Figure 14.19b.


    Figure 14.19 Typical two-sample plots when (a) random errors are significantly larger than systematic errors due to the analysts, and (b) when systematic errors due to the analysts are significantly larger than the random errors.

    A visual inspection of a two-sample chart is an effective method for qualitatively evaluating the results of analysts and the capabilities of a proposed standard method. If random errors are insignificant, then the points fall on the 45° line. The length of a perpendicular line from any point to the 45° line, therefore, is proportional to the effect of random error on that analyst’s results. The distance from the intersection of the axes—corresponding to the mean values for samples X and Y—to the perpendicular projection of a point on the 45° line is proportional to the analyst’s systematic error. Figure 14.20 illustrates these relationships. An ideal standard method has small random errors and small systematic errors due to the analysts, and has a compact clustering of points that is more circular than elliptical.


    Figure 14.20 Relationship between the result for a single analyst (in blue) and the contribution of random error (red arrow) and the contribution from the analyst’s systematic error (green arrow).

    We also can use the data in a two-sample chart to separate the total variation in the data, \(\sigma_{\textrm{tot}}\), into contributions from random error, \(\sigma_{\textrm{rand}}\), and systematic errors due to the analysts, \(\sigma_{\textrm{syst}}\).9 Because an analyst’s systematic errors are present in his or her analysis of both samples, the difference, D, between the results

    \[D_i = X_i - Y_i\]

    is the result of random error. To estimate the total contribution from random error we use the standard deviation of these differences, sD, for all analysts

    \[s_\textrm D=\sqrt{\dfrac{\sum (D_i-\overline D)^2}{2(n-1)}}=s_{\textrm{rand}}\approx \sigma_{\textrm{rand}}\tag{14.18}\]

    where n is the number of analysts. The factor of 2 in the denominator of equation 14.18 is the result of using two values to determine Di. The total, T, of each analyst’s results

    \[T_i = X_i + Y_i\]

    contains contributions from both random error and twice the analyst’s systematic error.

    \[\sigma_{\textrm{tot}}^2=\sigma_{\textrm{rand}}^2+2\sigma_{\textrm{syst}}^2\tag{14.19}\]

    We double the analyst’s systematic error in equation 14.19 because it is the same in each analysis.

    The standard deviation of the totals, sT, provides an estimate for σtot.

    \[s_\textrm T=\sqrt {\dfrac{\sum (T_i-\overline T)^2}{2(n-1)}}=s_{\textrm{tot}}\approx \sigma_{\textrm{tot}}\tag{14.20}\]

    Again, the factor of 2 in the denominator is the result of using two values to determine Ti.

    If the systematic errors are significantly larger than the random errors, then sT is larger than sD, a hypothesis we can evaluate using a one-tailed F‑test

    \[F=\dfrac{s_\textrm T^2}{s_\textrm D^2}\]

    where the degrees of freedom for both the numerator and the denominator are n – 1. As shown in the following example, if sT is significantly larger than sD we can use equation 14.19 to separate \(\sigma_{\textrm{tot}}^2\) into components representing random error and systematic error.

    For a review of the F-test, see Section 4.6.2 and Section 4.6.3. Example 4.18 illustrates a typical application.

    Example 14.6

    As part of a collaborative study of a new method for determining the amount of total cholesterol in blood, you send two samples to 10 analysts with instructions to analyze each sample one time. The following results, in mg total cholesterol per 100 mL of serum, are returned to you.

    analyst sample 1 sample 2
    1 245.0 229.4
    2 247.4 249.7
    3 246.0 240.4
    4 244.9 235.5
    5 255.7 261.7
    6 248.0 239.4
    7 249.2 255.5
    8 225.1 224.3
    9 255.0 246.3
    10 243.1 253.1

    Use this data to estimate σrand and σsyst for the method.

    Solution

    Figure 14.21 provides a two-sample plot of the results. The clustering of points suggests that the systematic errors of the analysts are significant. The vertical line at 245.9 mg/100 mL is the average value for sample 1 and the average value for sample 2 is shown by the horizontal line at 243.5 mg/100 mL. To estimate σrand and σsyst we first calculate values for Di and Ti.

    analyst Di Ti
    1 15.6 474.4
    2 -2.3 497.1
    3 5.6 486.4
    4 9.4 480.4
    5 -6.0 517.4
    6 8.6 487.4
    7 -6.3 504.7
    8 0.8 449.4
    9 8.7 501.3
    10 -10.0 496.2

    Next, we calculate the standard deviations for the differences, sD, and the totals, sT, using equations 14.18 and 14.20, giving sD = 5.95 and sT = 13.3. To determine if the systematic errors between the analysts are significant, we use an F-test to compare sT and sD.

    \[F=\dfrac{s_\textrm T^2}{s_\textrm D^2}=\dfrac{(13.3)^2}{(5.95)^2}=5.00\]

    Because the F-ratio is larger than F(0.05, 9, 9), which is 3.179, we conclude that the systematic errors between the analysts are significant at the 95% confidence level. The estimated precision for a single analyst is

    \[\sigma_{\textrm{rand}}\approx s_\textrm D=5.95\]

    (Critical values for the F-test are in Appendix 5.)

    The estimated standard deviation due to systematic errors between analysts is calculated from equation 14.19.

    \[\sigma_{\textrm{syst}}\approx \sqrt{\dfrac{\sigma_{\textrm{tot}}^2-\sigma_{\textrm{rand}}^2}{2}}\approx \sqrt{\dfrac{s_\textrm T^2-s_\textrm D^2}{2}}=\sqrt{\dfrac{(13.3)^2-(5.95)^2}{2}}=8.41\]

    Figure14.21.jpg

    Figure 14.21 Two-sample plot for the data in Example 14.6. The number by each blue point indicates the analyst. The true values for each sample (see Example 14.7) are indicated by the red star.
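    The calculations in Example 14.6 are easy to script. The following minimal R sketch (Section 14D covers R in more detail; the variable names here are our own) reproduces the estimates for σrand and σsyst and draws a basic two-sample chart. It assumes only base R.

        # results from Example 14.6 (mg total cholesterol per 100 mL of serum)
        x <- c(245.0, 247.4, 246.0, 244.9, 255.7, 248.0, 249.2, 225.1, 255.0, 243.1)
        y <- c(229.4, 249.7, 240.4, 235.5, 261.7, 239.4, 255.5, 224.3, 246.3, 253.1)
        n <- length(x)                                    # number of analysts

        D <- x - y                                        # differences: random error only
        Tot <- x + y                                      # totals: random + systematic error
        s.D <- sqrt(sum((D - mean(D))^2)/(2*(n - 1)))     # equation 14.18; 5.95
        s.T <- sqrt(sum((Tot - mean(Tot))^2)/(2*(n - 1))) # equation 14.20; 13.3

        F.exp <- s.T^2/s.D^2                              # 5.00
        F.exp > qf(0.95, n - 1, n - 1)                    # TRUE: systematic errors significant
        s.syst <- sqrt((s.T^2 - s.D^2)/2)                 # equation 14.19 rearranged; 8.41

        # two-sample chart: dashed lines mark the mean result for each sample
        plot(x, y, pch = 19, xlab = "sample 1", ylab = "sample 2")
        abline(v = mean(x), h = mean(y), lty = 2)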

    If the true values for the two samples are known, we also can test for the presence of a systematic error in the method. If there are no systematic method errors, then the sum of the true values, μtot, for samples X and Y

    \[\mu_\ce{tot} = \mu_\ce{X} + \mu_\ce{Y}\]

    should fall within the confidence interval around \(\overline{T}\). We can use a two-tailed t-test of the following null and alternate hypotheses

    \[H_0:\overline{T}= \mu_\ce{tot} \hspace{30px} H_\ce{A}: \overline{T}≠ \mu_\ce{tot}\]

    to determine if there is evidence for a systematic error in the method. The test statistic, texp, is

    \[t_{\textrm{exp}}=\dfrac{|\overline T-\,\mu_{\textrm{tot}}|\sqrt n}{s_\textrm T\sqrt2}\tag{14.21}\]

    with n – 1 degrees of freedom. We include the 2 in the denominator because sT (see equation 14.20) underestimates the standard deviation when comparing \(\overline{T}\) to \(\mu_{\textrm{tot}}\).

    For a review of the t-test of an experimental mean to a known mean, see Section 4.6.1. Example 4.16 illustrates a typical application.

    Example 14.7

    The two samples analyzed in Example 14.6 are known to contain the following concentrations of cholesterol

    \[\mathrm{\mu_{samp\: 1} = 248.3\: mg/100\: mL \hspace{30px} \mu_{samp\: 2} = 247.6\: mg/100\: mL}\]

    Determine if there is any evidence for a systematic error in the method at the 95% confidence level.

    Solution

    Using the data from Example 14.6 and the true values for the samples, we know that sT is 13.3, and that

    \[\overline{T} = \overline{X}_\textrm{samp 1} + \overline{X}_\textrm{samp 2} = 245.9 + 243.5 = \textrm{489.4 mg/100 mL}\]

    \[\mu_\textrm{tot} = \mu_\textrm{samp 1} + \mu_\textrm{samp 2} = 248.3 + 247.6 = \textrm{495.9 mg/100 mL}\]

    Substituting these values into equation 14.21 gives

    \[t_{\textrm{exp}}=\dfrac{|489.4-495.9|\sqrt{10}}{13.3\sqrt2}=1.09\]

    Because this value for texp is smaller than the critical value of 2.26 for t(0.05, 9), there is no evidence for a systematic error in the method at the 95% confidence level.

    (Critical values for the t-test are in Appendix 4.)
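    The t-test in Example 14.7 takes only a few more lines of R, continuing the sketch from Example 14.6 (it assumes the objects Tot, s.T, and n defined there).

        mu.tot <- 248.3 + 247.6                                 # 495.9 mg/100 mL
        t.exp <- abs(mean(Tot) - mu.tot)*sqrt(n)/(s.T*sqrt(2))  # equation 14.21; 1.08
        t.exp > qt(0.975, n - 1)                                # FALSE: no evidence of method bias

    (The sketch returns 1.08 rather than 1.09 because it carries the unrounded value for \(\overline{T}\).)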

    Example 14.6 and Example 14.7 illustrate how we can use a pair of similar samples in a collaborative test of a new method. Ideally, a collaborative test involves several pairs of samples that span the range of analyte concentrations for which we plan to use the method. In doing so, we evaluate the method for constant sources of error and establish the expected relative standard deviation and bias for different levels of analyte.

    14.3.2 Collaborative Testing and Analysis of Variance

    In a two-sample collaborative test we ask each analyst to perform a single determination on each of two separate samples. After reducing the data to a set of differences, D, and a set of totals, T, each characterized by a mean and a standard deviation, we extract values for the random errors affecting precision and the systematic differences between the analysts. The calculations are relatively simple and straightforward.

    An alternative approach for completing a collaborative test is to have each analyst perform several replicate determinations on a single, common sample. This approach generates a separate data set for each analyst, requiring a different statistical treatment to arrive at estimates for σrand and σsyst.

    There are several statistical methods for comparing three or more sets of data. The approach we consider in this section is an analysis of variance (ANOVA). In its simplest form, a one-way ANOVA allows us to explore the importance of a single variable—the identity of the analyst is one example—on the total variance. To evaluate the importance of this variable, we compare its variance to the variance explained by indeterminate sources of error.

    We first introduced variance in Chapter 4 as one measure of a data set’s spread around its central tendency. In the context of an analysis of variance, it is useful for us to understand that variance is simply a ratio of two terms: a sum of squares for the differences between individual values and their mean, and the available degrees of freedom. For example, the variance, s2, of a data set consisting of n measurements is

    \[s^2=\dfrac{\sum (X_i-\overline X)^2}{n-1}=\dfrac{\textrm{sum of squares}}{\textrm{degrees of freedom}}\]

    where \(X_i\) is the value of a single measurement and \(\overline{X}\) is the mean. The ability to partition the variance into a sum of squares and the degrees of freedom greatly simplifies the calculations in a one-way ANOVA.

    Let’s use a simple example to develop the rationale behind a one-way ANOVA calculation. The data in Table 14.6 are from four analysts, each asked to determine the purity of a single pharmaceutical preparation of sulfanilamide. Each column in Table 14.6 provides the results for an individual analyst. To help us keep track of this data, we will represent each result as \(X_{ij}\), where i identifies the analyst and j indicates the replicate. For example, \(X_{3,5}\) is the fifth replicate for the third analyst, or 94.24%.

    Table 14.6 Determination of the %Purity of a Sulfanilamide Preparation by Four Analysts
    replicate           analyst A   analyst B   analyst C   analyst D
    1                   94.09       99.55       95.14       93.88
    2                   94.64       98.24       94.62       94.23
    3                   95.08       101.1       95.28       96.05
    4                   94.54       100.4       94.59       93.89
    5                   95.38       100.1       94.24       94.95
    6                   93.62                               95.49
    \(\overline{X}\)    94.56       99.88       94.77       94.75
    s                   0.641       1.073       0.428       0.899

    The data in Table 14.6 show variability, both in the results obtained by each analyst and in the difference in the results between the analysts. There are two sources for this variability: indeterminate errors associated with the analytical procedure experienced equally by each analyst, and systematic or determinate errors introduced by individual analysts.

    One way to view the data in Table 14.6 is to treat it as a single large sample, characterized by a global mean and a global variance

    \[\overline{\overline X}=\dfrac{\sum_{i=1}^{h}\sum_{j=1}^{n_i}X_{ij}}{N}\tag{14.22}\]

    \[\overline{\overline{s^2}}=\dfrac{\sum_{i=1}^{h}\sum_{j=1}^{n_i}(X_{ij}-\overline{\overline X})^2}{N-1}\tag{14.23}\]

    where h is the total number of samples (in this case the number of analysts), ni is the number of replicates for the ith sample (in this case the ith analyst), and N is the total number of data points (in this case 22). The global variance—which includes all sources of variability affecting the data—provides an estimate of the combined influence of indeterminate errors and systematic errors.

    A second way to work with the data in Table 14.6 is to treat the results for each analyst separately. If we assume that each analyst experiences the same indeterminate errors, then the variance, \(s^2\), for each analyst provides a separate estimate of \(\sigma_{\textrm{rand}}^2\). To pool these individual variances, which we call the within-sample variance, \(s_\textrm w^2\), we square the difference between each replicate and its corresponding mean, add them up, and divide by the degrees of freedom.

    \[\sigma _{\textrm{rand}}^2\approx s_\textrm w^2=\dfrac{\sum_{i=1}^{h}\sum_{j=1}^{n_i}( X_{ij}-\overline X_i)^2}{N-h}\tag{14.24}\]

    Note

    Carefully compare our description of equation 14.24 to the equation itself. It is important that you understand why equation 14.24 provides our best estimate of the indeterminate errors affecting the data in Table 14.6. Note that we lose one degree of freedom for each of the h means included in the calculation.

    Equation 14.24 provides an estimate for \(\sigma_{\textrm{rand}}^2\). To estimate the systematic errors, \(\sigma_{\textrm{syst}}^2\), affecting the results in Table 14.6 we need to consider the differences between the analysts. The variance of the individual mean values about the global mean, which we call the between-sample variance, \(s_\textrm b^2\), provides this estimate.

    \[\sigma _{\textrm{syst}}^2\approx s_\textrm b^2=\dfrac{\sum_{i=1}^{h}n_i(\overline X_i-\overline{\overline{X}})^2}{h-1}\tag{14.25}\]

    Note

    We lose one degree of freedom for the global mean.

    The between-sample variance includes contributions from both indeterminate errors and systematic errors

    \[s_\ce{b}^2= σ_\ce{rand}^2 + \bar{n}σ_\ce{syst}^2\tag{14.26}\]

    where \(\bar n\) is the average number of replicates per analyst.

    \[\bar n = \dfrac{\sum_{i=1}^{h}n_i}{h}\]

    Note

    Note the similarity between equation 14.26 and equation 14.19. The analysis of the data in a two-sample plot is the same as a one-way analysis of variance with \(\bar n = 2\).

    In a one-way ANOVA of the data in Table 14.6 we make the null hypothesis that there are no significant differences between the mean values for each analyst. The alternative hypothesis is that at least one of the means is significantly different. If the null hypothesis is true, then \(\sigma_{\textrm{syst}}^2\) must be zero, and \(s_\textrm w^2\) and \(s_\textrm b^2\) should have similar values. If \(s_\textrm b^2\) is significantly greater than \(s_\textrm w^2\), then \(\sigma_{\textrm{syst}}^2\) is greater than zero. In this case we must accept the alternative hypothesis that there is a significant difference between the means for the analysts. The test statistic is the F-ratio

    \[F_{\textrm{exp}}=\dfrac{s_\ce{b}^2}{s_\ce{w}^2}\]

    which is compared to the critical value \(F(\alpha, h - 1, N - h)\). This is a one-tailed significance test because we are only interested in whether \(s_\textrm b^2\) is significantly greater than \(s_\textrm w^2\).

    Both \(s_\textrm b^2\) and \(s_\textrm w^2\) are easy to calculate for small data sets. For larger data sets, calculating \(s_\textrm w^2\) is tedious. We can simplify the calculations by taking advantage of the relationship between the sum-of-squares terms for the global variance (equation 14.23), the within-sample variance (equation 14.24), and the between-sample variance (equation 14.25). We can split the numerator of equation 14.23, which is the total sum-of-squares, \(SS_\textrm t\), into two terms

    \[SS_\ce{t} = SS_\ce{w} + SS_\ce{b}\]

    where \(SS_\textrm w\) is the sum-of-squares for the within-sample variance and \(SS_\textrm b\) is the sum-of-squares for the between-sample variance. Calculating \(SS_\textrm t\) and \(SS_\textrm b\) gives \(SS_\textrm w\) by difference. Finally, dividing \(SS_\textrm w\) and \(SS_\textrm b\) by their respective degrees of freedom gives \(s_\textrm w^2\) and \(s_\textrm b^2\). Table 14.7 summarizes the equations for a one-way ANOVA calculation. Example 14.8 walks you through the calculations, using the data in Table 14.6. Section 14D provides instructions on using Excel and R to complete a one-way analysis of variance.

    Note

    Problem 14.17 in the end of chapter problems asks you to verify this relationship between the sum-of-squares.

    Table 14.7 Summary of Calculations for a One-Way Analysis of Variance

    between samples: sum-of-squares \(SS_\textrm b = \sum_{i=1}^{h}n_i(\overline X_i-\overline{\overline{X}})^2\); degrees of freedom \(h - 1\); variance \(s_\textrm b^2=\dfrac{SS_\textrm b}{h-1}\); expected variance \(s_\textrm b^2=\sigma _{\textrm{rand}}^2+\bar n\sigma_{\textrm{syst}}^2\); F-ratio \(F_{\textrm{exp}}=\dfrac{s_\textrm b^2}{s_\textrm w^2}\)

    within samples: sum-of-squares \(SS_\textrm w = SS_\textrm t - SS_\textrm b\); degrees of freedom \(N - h\); variance \(s_\textrm w^2=\dfrac{SS_\textrm w}{N-h}\); expected variance \(s_\textrm w^2 = \sigma_{\textrm{rand}}^2\)

    total: sum-of-squares \(SS_\textrm t = \sum_{i=1}^{h}\sum_{j=1}^{n_i}(X_{ij}-\overline{\overline{X}})^2 = \overline{\overline{s^2}}(N-1)\); degrees of freedom \(N - 1\)

    Example 14.8

    The data in Table 14.6 are from four analysts, each asked to determine the purity of a single pharmaceutical preparation of sulfanilamide. Determine if the difference in their results is significant at \(\alpha = 0.05\). If such a difference exists, estimate values for \(\sigma_{\textrm{rand}}^2\) and \(\sigma_{\textrm{syst}}^2\).

    Solution

    To begin we calculate the global mean (equation 14.22) and the global variance (equation 14.23) for the pooled data, and the means for each analyst; these values are summarized here.

    \[\overline{\overline X}=95.87 \hspace{5 mm}\overline{\overline {s^2}}=5.506\]

    \[\overline X_\textrm A=94.56 \hspace{5 mm}\overline X_\textrm B=99.88\hspace{5 mm}\overline X_\textrm C=94.77\hspace{5 mm}\overline X_\textrm D=94.75\]

    Using these values we calculate the total sum of squares

    \[SS_\textrm t=\overline{\overline{s^2}}(N-1)=(5.506)(22-1)=115.63\]

    the between sample sum of squares

    \[SS_\textrm b=\sum_{i=1}^{h}n_i(\overline X_i-\overline{\overline X})^2=6(94.56-95.87)^2+5(99.88-95.87)^2 +5(94.77-95.87)^2+6(94.75-95.87)^2=104.27\]

    and the within sample sum of squares

    \[SS_\ce{w} = SS_\ce{t} - SS_\ce{b} = 115.63 - 104.27 = 11.36\]

    The remainder of the necessary calculations are summarized in the following table.

    source              sum-of-squares      degrees of freedom      variance
    between samples     104.27              h – 1 = 4 – 1 = 3       34.76
    within samples      11.36               N – h = 22 – 4 = 18     0.631

    Comparing the variances we find that

    \[F_{\textrm{exp}}=\dfrac{s_\textrm b^2}{s_\textrm w^2}=\dfrac{34.76}{0.631}=55.08\]

    Because Fexp is greater than F(0.05, 3, 18), which is 3.16, we reject the null hypothesis and accept the alternative hypothesis that the work of at least one analyst is significantly different from the remaining analysts. Our best estimate of the within sample variance is

    \[\sigma_{\textrm{rand}}^2\approx s_\textrm w^2=0.631\]

    and our best estimate of the between sample variance is

    \[\sigma_{\textrm{syst}}^2\approx \dfrac{s_\textrm b^2-s_\textrm w^2}{\bar n}=\dfrac {34.76-0.631}{22/4}=6.205\]

    In this example the variance due to systematic differences between the analysts is almost an order of magnitude greater than the variance due to the method’s precision.
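    R’s built-in aov function carries out this one-way analysis of variance directly. The following minimal sketch (base R only; the variable names are our own) reproduces the results of Example 14.8, apart from small rounding differences.

        # data from Table 14.6: %purity of a sulfanilamide preparation
        purity <- c(94.09, 94.64, 95.08, 94.54, 95.38, 93.62,   # analyst A
                    99.55, 98.24, 101.1, 100.4, 100.1,          # analyst B
                    95.14, 94.62, 95.28, 94.59, 94.24,          # analyst C
                    93.88, 94.23, 96.05, 93.89, 94.95, 95.49)   # analyst D
        analyst <- factor(rep(c("A", "B", "C", "D"), times = c(6, 5, 5, 6)))

        fit <- aov(purity ~ analyst)
        summary(fit)                   # sums-of-squares, degrees of freedom, F of about 55

        ms <- summary(fit)[[1]][["Mean Sq"]]
        s.b2 <- ms[1]                  # between-sample variance, about 34.7
        s.w2 <- ms[2]                  # within-sample variance, about 0.64
        n.bar <- mean(c(6, 5, 5, 6))   # average replicates per analyst, 22/4
        (s.b2 - s.w2)/n.bar            # sigma(syst)^2, about 6.2 (equation 14.26 rearranged)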

    Having demonstrated that there is a significant difference between the analysts, we can use a modified version of the t-test, known as Fisher’s least significant difference, to determine which analyst or analysts are responsible for the difference. The test statistic for comparing two mean values is the t-test given in equation 4.21 in Chapter 4, except we replace the pooled standard deviation, \(s_{\textrm{pool}}\), by the square root of the within-sample variance from the analysis of variance.

    \[t_{\textrm{exp}}=\dfrac{|\overline X_1-\overline X_2|}{\sqrt {s_\textrm w^2}\sqrt{\dfrac{1}{n_\textrm A}+\dfrac{1}{n_\textrm B}}}=\dfrac{|\overline X_1-\overline X_2|}{\sqrt{s_\textrm w^2}}\times \sqrt{\dfrac{n_\textrm An_\textrm B}{n_\textrm A+n_\textrm B}}\tag{14.27}\]

    We compare texp to its critical value t(α, ν) using the same significance level as the ANOVA calculation. The degrees of freedom are the same as that for the within sample variance. Since we are interested in whether the larger of the two means is significantly greater than the other mean, the value of t(α, ν) is that for a one-tailed significance test.

    You might ask why we bother with the analysis of variance if we are planning to use a t-test to compare pairs of analysts. Each t-test carries a probability, α, of claiming that a difference is significant even though it is not (a type 1 error). If we set α to 0.05 and complete six t-tests, the probability of a type 1 error increases to 0.265. Knowing that there is a significant difference within a data set—what we gain from the analysis of variance—protects the t-test.

    Example 14.9

    In Example 14.8 we showed that there is a significant difference between the work of the four analysts in Table 14.6. Determine the source of this significant difference.

    Solution

    Individual comparisons using Fisher’s least significant difference test are based on the following null hypothesis and the appropriate one-tailed alternative hypothesis.

    \[H_0: \overline{X}_i =\overline{X}_j \hspace{30px} H_\ce{A}:\textrm{ one of the following }\overline{X}_i > \overline{X}_j\textrm{ or }\overline{X}_i < \overline{X}_j\]

    Using equation 14.27 we calculate values of texp for each possible comparison and compare them to the one-tailed critical value of 1.73 for t(0.05, 18). For example, texp for analysts A and B is

    \[ (t_{\textrm{exp}})_{\textrm{A,B}}=\dfrac{|\overline X_\textrm A-\overline X_\textrm B|}{\sqrt{s_\textrm w^2}}\times \sqrt{\dfrac{n_\textrm An_\textrm B}{n_\textrm A+n_\textrm B}}=\dfrac{|94.56-99.88|}{\sqrt{0.631}}\times \sqrt{\dfrac{6 \times 5}{6+5}}=11.06\]

    Because (texp)A,B is greater than t(0.05, 18) we reject the null hypothesis and accept the alternative hypothesis that the results for analyst B are significantly greater than those for analyst A. Continuing with the other pairs it is easy to show that (texp)A,C is 0.437, (texp)A,D is 0.414, (texp)B,C is 10.17, (texp)B,D is 10.67, and (texp)C,D is 0.04. Collectively, these results suggest that there is a significant systematic difference between the work of analyst B and the work of the other analysts. There is no way to decide whether any of the four analysts has done accurate work.

    (We know that analyst B’s result is significantly different than the results for analysts A, C, and D, and that we have no evidence that there is any significant difference between the results of analysts A, C, and D. We do not know if analyst B’s results is accurate, or if the results of analysts A, C, and D are accurate. In fact, it is possible that none of the results in Table 14.6 are accurate.)
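    These pairwise comparisons take only a few lines of R, continuing the sketch from Example 14.8 (it assumes purity, analyst, and s.w2 from there; the helper function lsd is our own).

        xbar <- tapply(purity, analyst, mean)    # each analyst's mean
        n.i <- tapply(purity, analyst, length)   # each analyst's number of replicates
        lsd <- function(i, j)                    # t.exp from equation 14.27
          abs(xbar[i] - xbar[j])/sqrt(s.w2) * sqrt(n.i[i]*n.i[j]/(n.i[i] + n.i[j]))
        lsd("A", "B")                            # about 11, well above the critical value
        qt(0.95, 18)                             # one-tailed t(0.05, 18); 1.73

        # why run the ANOVA first? six t-tests at alpha = 0.05 inflate the type 1 error
        1 - (1 - 0.05)^6                         # 0.265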

    We can extend an analysis of variance to systems involving more than a single variable. For example, we can use a two-way ANOVA to determine the effect on an analytical method of both the analyst and the instrumentation. The treatment of multivariate ANOVA is beyond the scope of this text, but is covered in several of the texts listed in this chapter’s additional resources.

    14.3.3 What is a Reasonable Result for a Collaborative Study?

    Collaborative testing provides us with a method for estimating the variability (or reproducibility) between analysts in different labs. If the variability is significant, we can determine what portion is due to indeterminate method errors (\(\sigma_{\textrm{rand}}^2\)) and what portion is due to systematic differences between the analysts (\(\sigma_{\textrm{syst}}^2\)). What we have left unanswered is the following important question: what is a reasonable value for a method’s reproducibility?

    An analysis of nearly 10 000 collaborative studies suggests that a reasonable estimate for a method’s reproducibility is

    \[R=2^{(1-0.5\log C)}\tag{14.28}\]

    where R is the percent relative standard deviation for the results included in the collaborative study and C is the fractional amount of analyte in the sample on a weight-to-weight basis.10 Equation 14.28 appears to be independent of the type of analyte, the type of matrix, and the method of analysis. For example, when a sample in a collaborative study contains 1 microgram of analyte per gram of sample, C is \(10^{-6}\) and the estimated relative standard deviation is

    \[R=2^{(1-0.5\log 10^{-6})}=16\%\]

    For a discussion of the limitations of equation 14.28, see Linsinger, T. P. J.; Josephs, R. D. “Limitations of the Application of the Horwitz Equation,” Trends Anal. Chem. 2006, 25, 1125–1130, as well as a rebuttal (Thompson, M. “Limitations of the Application of the Horwitz Equation: A Rebuttal,” Trends Anal. Chem. 2007, 26, 659–661) and response (Linsinger, T. P. J.; Josephs, R. D. “Reply to Professor Michael Thompson’s Rebuttal,” Trends Anal. Chem. 2007, 26, 662–663).

    Example 14.10

    What is the estimated relative standard deviation for the results of a collaborative study when the sample is pure analyte (100% w/w analyte)? Repeat for the case where the analyte’s concentration is 0.1% w/w.

    Solution

    When the sample is 100% w/w analyte (C = 1) the estimated relative standard deviation is

    \[R=2^{(1-0.5\log 1)}=2\%\]

    We expect that approximately 67% of the participants in the collaborative study (±1σ) will report the analyte’s concentration within the range of 98% w/w to 102% w/w. If the analyte’s concentration is 0.1% w/w (C = 0.001), the estimated relative standard deviation is

    \[R=2^{(1-0.5\log 0.001)}=5.7\%\]

    and we expect that 67% of the analysts will report the analyte’s concentration within the range of 0.094% w/w to 0.106% w/w.

    (For a normal distribution, 68.26% of the results fall within ±1s of the population’s mean (see Table 4.12).)
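    Equation 14.28 reduces to a one-line function in R (the name horwitz is our own); the calls below reproduce the estimates in Example 14.10 and the 1 μg/g example above.

        # C is the analyte's weight fraction; the result is the expected percent
        # relative standard deviation between analysts (equation 14.28)
        horwitz <- function(C) 2^(1 - 0.5*log10(C))
        horwitz(1e-6)    # 16: one microgram of analyte per gram of sample
        horwitz(1)       # 2: a sample that is pure analyte
        horwitz(0.001)   # 5.7: an analyte at 0.1% w/w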

    Of course, equation 14.28 only estimates the expected relative standard deviation. If the method’s relative standard deviation falls within a range of one-half to twice the estimated value, then it is acceptable for use by analysts in different laboratories. The percent relative standard deviation for a single analyst should be one-half to two-thirds of that for the variability between analysts.


    This page titled 14.3: Validating the Method as a Standard Method is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by David Harvey.