Nonparametric Methods

in mathematical statistics, methods of directly estimating a theoretical probability distribution and various general properties of the distribution, such as its symmetry, based on the results of observations. The term “nonparametric methods” emphasizes that such methods differ from classical (parametric) methods; in classical methods, it is assumed that (1) an unknown theoretical distribution belongs to some family that depends on a finite number of parameters (for example, the family of normal distributions), (2) the unknown values of these parameters can be estimated from the results of observations, and (3) various hypotheses about these unknown values can be tested. The development of nonparametric methods was due, in large part, to Soviet scientists.

As an example of a nonparametric method we may take the method obtained by A. N. Kolmogorov for verifying the agreement of theoretical and empirical distributions (called the Kolmogorov test). Suppose the results of n independent observations of some quantity with distribution function F(x) are obtained. Let Fn(x) denote the empirical distribution function constructed from these n observations, and let Dn denote the largest absolute value of the difference Fn(x) − F(x). If F(x) is continuous, then the random variable λn = √n Dn has a distribution function Kn(λ) that is independent of F(x) and approaches the limit

K(λ) = Σ_{k=−∞}^{+∞} (−1)^k e^{−2k²λ²}

as n increases without bound. Hence, for sufficiently large n, the approximation

(*) Pn ≈ 1 − K(λ)

is obtained, where Pn is the probability of the inequality √n Dn ≥ λ. The function K(λ) has been tabulated. Its values for some λ are presented in Table 1.

Table 1. Values of K(λ)

λ ......... 0.57   0.71   0.83   1.02   1.36   1.63
K(λ) ...... 0.10   0.30   0.50   0.75   0.95   0.99
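
As an illustration of the computation just described, here is a minimal Python sketch. The function names K and kolmogorov_test are ours, not standard, and the series for K(λ) is truncated at |k| ≤ 100, which is ample for the values of λ in Table 1; a library implementation of the same test is available as scipy.stats.kstest.

```python
import numpy as np

def K(lam, terms=100):
    """Kolmogorov limit distribution: sum over integer k of (-1)^k exp(-2 k^2 lam^2)."""
    if lam <= 0:
        return 0.0
    k = np.arange(-terms, terms + 1)
    return float(np.sum((-1.0) ** k * np.exp(-2.0 * k**2 * lam**2)))

def kolmogorov_test(sample, F):
    """Return (D_n, p), where p uses the approximation (*): P_n ~ 1 - K(sqrt(n) D_n)."""
    x = np.sort(sample)
    n = len(x)
    cdf = F(x)
    # The empirical CDF jumps at each order statistic, so the supremum of
    # |F_n(x) - F(x)| is attained at or just before one of the jumps.
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)
    d_minus = np.max(cdf - np.arange(n) / n)
    d_n = max(d_plus, d_minus)
    return d_n, 1.0 - K(np.sqrt(n) * d_n)

# Check against Table 1: K(0.83) should come out close to 0.50.
print(round(K(0.83), 3))

# Test a Uniform(0, 1) sample against its own distribution function.
rng = np.random.default_rng(0)
d_n, p = kolmogorov_test(rng.uniform(size=200), lambda t: np.clip(t, 0.0, 1.0))
print(d_n, p)
```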

The approximation (*) is used in the following way to verify the hypothesis that an observed random variable has distribution function F(x). First, the value of Dn is found from the results of observations. Then, using equation (*), we calculate the probability of obtaining a deviation of Fn from F greater than or equal to the observed value. If this probability is sufficiently low, the hypothesis is rejected, in accordance with the general principles for testing statistical hypotheses; otherwise, it is assumed that the results of the trial do not contradict the hypothesis. The hypothesis that two independent samples, of sizes n1 and n2, are obtained from the same population with a continuous distribution law is verified similarly. In this case, the approximation (*) is replaced by the statement, established by N. V. Smirnov, that the probability of the inequality

√(n1 n2 / (n1 + n2)) Dn1,n2 ≥ λ

has the limit K(λ); here, Dn1,n2 is the largest absolute value of the difference Fn1(x) − Fn2(x).
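
The two-sample statistic admits an equally short sketch, under the same caveats as above (the helper names are ours; scipy.stats.ks_2samp is the library counterpart). Both empirical distribution functions are step functions, so their largest discrepancy occurs at a point of the pooled sample.

```python
import numpy as np

def K(lam, terms=100):
    # Same limit distribution K(lambda) as in the previous sketch.
    if lam <= 0:
        return 0.0
    k = np.arange(-terms, terms + 1)
    return float(np.sum((-1.0) ** k * np.exp(-2.0 * k**2 * lam**2)))

def smirnov_two_sample(x, y):
    """Return (D_{n1,n2}, p) using Smirnov's limit theorem."""
    x, y = np.sort(x), np.sort(y)
    n1, n2 = len(x), len(y)
    pooled = np.concatenate([x, y])
    # Evaluate both empirical CDFs at every pooled sample point.
    F1 = np.searchsorted(x, pooled, side="right") / n1
    F2 = np.searchsorted(y, pooled, side="right") / n2
    d = np.max(np.abs(F1 - F2))
    lam = np.sqrt(n1 * n2 / (n1 + n2)) * d
    return d, 1.0 - K(lam)
```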

Methods of testing the hypothesis that a theoretical distribution belongs to the family of normal distributions constitute a second example of nonparametric methods. We note here only one of these methods, the normal plot. This method is based on the following idea. If the random variable X has a normal distribution with parameters α and σ, then

Φ⁻¹[F(x)] = (x − α)/σ

where Φ⁻¹ is the function inverse to the standard normal distribution function Φ. Thus, in this case, the graph of the function y = Φ⁻¹[F(x)] is a straight line, while the graph of y = Φ⁻¹[Fn(x)] is a jagged line close to that straight line (see Figure 1). The closeness of the jagged line to a straight line then serves as a criterion for verifying the hypothesis that the distribution F(x) is normal.

Figure 1
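
A rough Python sketch of the normal plot follows. It assumes the common convention of plotting positions (i − 1/2)/n, which keeps Φ⁻¹ finite at the extreme order statistics; this convention, like the helper name, goes beyond the article, and scipy.stats.probplot offers a ready-made version.

```python
import numpy as np
from scipy.stats import norm

def normal_plot_points(sample):
    """Return points (x_(i), Phi^{-1}[F_n(x_(i))]); for a normal sample they lie
    near a straight line with slope 1/sigma and intercept -alpha/sigma."""
    x = np.sort(sample)
    n = len(x)
    # (i - 0.5)/n instead of i/n keeps Phi^{-1} finite at the last point.
    probs = (np.arange(1, n + 1) - 0.5) / n
    return x, norm.ppf(probs)

rng = np.random.default_rng(0)
xs, ys = normal_plot_points(rng.normal(loc=2.0, scale=3.0, size=500))
# A least-squares line fit quantifies how straight the plot is.
slope, intercept = np.polyfit(xs, ys, 1)
print(slope, intercept)   # roughly 1/3 and -2/3 for these parameters
```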

REFERENCES

Smirnov, N. V., and I. V. Dunin-Barkovskii. Kurs teorii veroiatnostei i matematicheskoi statistiki dlia tekhnicheskikh prilozhenii, 3rd ed. Moscow, 1969.
Bol’shev, L. N., and N. V. Smirnov. Tablitsy matematicheskoi statistiki. Moscow, 1968.

IU. V. PROKHOROV