Name
Orlov Alexander Ivanovich
Scholastic degree
•
•
•
Academic rank
professor
Honorary rank
—
Organization, job position
Bauman Moscow State Technical University
Web site url
—
Articles count: 155
Many applications require the analysis of several expert orderings, i.e., clustered rankings of the objects under examination. Such applications arise in engineering studies, ecology, management, economics, sociology, forecasting, etc. The objects may be product samples, technologies, mathematical models, projects, job applicants, and so on. Clustered rankings can be obtained both from experts and objectively, for example, by comparing mathematical models with experimental data using a particular quality criterion. The method described in this article was developed in connection with problems of chemical safety and environmental security of the biosphere. We propose a new method for constructing a clustered ranking that is average (in the sense discussed in this work) for all the clustered rankings under consideration. The contradictions between the individual initial rankings are then contained within the clusters of the average (coordinated) ranking. As a result, the order of the clusters reflects the general opinion of the experts or, more precisely, what is common to all the original rankings. The newly built clustered ranking is often called the matching (coordinated) ranking with respect to the original clustered rankings. The clusters contain the objects about which some of the initial rankings contradict each other. These objects require further study. Such study can be formal and mathematical (calculation of the Kemeny median, ordering by means or medians of ranks, etc.), or it may require new information from the relevant application area and possibly additional scientific research. In this article we introduce the necessary concepts, formulate in general terms the new algorithm for constructing the coordinated ranking for several clustered rankings, and discuss its properties
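The idea of keeping contradictions inside clusters can be illustrated with a minimal sketch. It is not the article's algorithm as published: it assumes that two objects are "contradictory" when one ranking places the first strictly ahead and another ranking places the second strictly ahead, that clusters are the connected components of this contradiction relation, and (a further assumption made here only to obtain an ordering) that clusters are sorted by the average position of their members. The function names are hypothetical.

```python
from itertools import combinations

def positions(ranking):
    """Map each object to the index of its cluster in a clustered ranking."""
    return {obj: i for i, cluster in enumerate(ranking) for obj in cluster}

def coordinated_ranking(rankings):
    """Sketch: merge contradictory objects into clusters, then order the clusters."""
    pos = [positions(r) for r in rankings]
    objects = sorted(pos[0])
    parent = {o: o for o in objects}  # union-find forest over the objects

    def find(o):
        while parent[o] != o:
            parent[o] = parent[parent[o]]
            o = parent[o]
        return o

    for a, b in combinations(objects, 2):
        # -1: a strictly before b, 0: same cluster, +1: a strictly after b
        signs = {(p[a] > p[b]) - (p[a] < p[b]) for p in pos}
        if 1 in signs and -1 in signs:  # the rankings contradict each other
            parent[find(a)] = find(b)

    clusters = {}
    for o in objects:
        clusters.setdefault(find(o), []).append(o)

    # Assumption: order clusters by the mean total position of their members.
    def mean_position(cluster):
        return sum(p[o] for p in pos for o in cluster) / len(cluster)

    return sorted(clusters.values(), key=mean_position)

ranking_a = [[1], [2, 3], [4]]      # 1 < {2, 3} < 4
ranking_b = [[2], [1], [3], [4]]    # 2 < 1 < 3 < 4
result = coordinated_ranking([ranking_a, ranking_b])
```

Here objects 1 and 2 are ordered in opposite ways by the two rankings, so they end up in one cluster, while objects 3 and 4 keep their places after it.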
The instrumental methods of economics include the Monte Carlo method (the method of statistical simulation). It is widely used in the development, study, and application of mathematical research methods in econometrics, applied statistics, and organizational and economic modeling, in the development and making of management decisions, and as the basis of simulation modeling. The new paradigm of mathematical research methods that we have developed is based on the use of the Monte Carlo method. In mathematical statistics, limit theorems on the asymptotic behavior of the random variables under consideration have been obtained for many methods of data analysis as sample sizes grow without bound. The next step is to study the properties of these random variables for finite sample sizes, and the Monte Carlo method is used for this purpose. In this article we use it to study the properties of statistical tests of the homogeneity of two independent samples. We consider the tests most often used in the analysis of real data: Cramer-Welch (which coincides with Student's test when the sample sizes are equal), Lord, Wilcoxon (Mann-Whitney), Wolfowitz, Van der Waerden, Smirnov, and the omega-square type test (Lehmann-Rosenblatt). The Monte Carlo method allows us to estimate the rates of convergence of the distributions of the test statistics to their limits and to compare the properties of the tests for finite sample sizes. To use the Monte Carlo method, the distribution functions of the elements of the two samples must be chosen; normal and Weibull-Gnedenko distributions are used for this purpose. We arrive at the following recommendation: to test the hypothesis that the distribution functions of two samples coincide, it is advisable to use the Lehmann-Rosenblatt (omega-square type) test. If there is reason to assume that the distributions differ mainly by a shift, the Wilcoxon and Van der Waerden tests can also be used. However, even in this case the omega-square type test may be more powerful. In the general case, the Smirnov test is also permissible alongside the Lehmann-Rosenblatt test, although its real significance level may differ from the nominal one. We also studied how often the statistical conclusions drawn from different tests disagree
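A minimal sketch of such a Monte Carlo study is given below. It estimates the real significance level of the two-sided Wilcoxon rank-sum test (with the usual normal approximation) for two standard normal samples of finite size. The sample sizes, the number of replications, the seed, and the function names are illustrative assumptions, not values taken from the article.

```python
import math
import random

def wilcoxon_rejects(x, y, z_crit=1.96):
    """Two-sided Wilcoxon rank-sum test using the normal approximation."""
    n, m = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    # Sum of the ranks of the first sample in the pooled ordering.
    w = sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == 0)
    mean = n * (n + m + 1) / 2
    var = n * m * (n + m + 1) / 12
    return abs(w - mean) / math.sqrt(var) > z_crit

def estimated_level(n, m, replications=2000, seed=0):
    """Share of false rejections when both samples come from N(0, 1)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(m)]
        rejections += wilcoxon_rejects(x, y)
    return rejections / replications

level = estimated_level(20, 20)  # compare with the nominal level 0.05
```

Replacing `rng.gauss` with another generator (for example `rng.weibullvariate`) or shifting one sample turns the same loop into an estimate of the test's power.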
Applied statistics is the science of how to analyze statistical data. As an independent scientific and practical area it is developing very quickly. It includes numerous widely and deeply developed scientific directions. Those who use applied statistics and other statistical methods are usually focused on specific areas of study, i.e., they are not specialists in applied statistics. Therefore, it is useful to analyze critically the current state of applied statistics and to discuss trends in the development of statistical methods. The great practical importance of applied statistics justifies work on its methodology, in which this field of scientific and applied activity is considered as a whole. We give some brief information about the history of applied statistics. Based on the scientometrics of applied statistics, we state that each expert possesses only a small part of the knowledge accumulated in this area. We discuss five topical areas in which modern applied statistics is developing, i.e., five "points of growth": nonparametric statistics, robustness, the bootstrap, statistics of interval data, and statistics of non-numerical data. We discuss some details of the basic ideas of non-numerical statistics. For more than 60 years there has been a huge gap in Russia between official statistics and the scientific community of experts on statistical methods
Statistical control is sampling control based on probability theory and mathematical statistics. The article presents the development of the methods of statistical control in our country. It discusses the basics of the theory of statistical control: control plans and their operating characteristics, the risks of the supplier and the consumer, and the acceptance and rejection levels of defectiveness. We have obtained an asymptotic method for the synthesis of control plans based on the limit of the average outgoing level of defectiveness. We have also developed the asymptotic theory of single sampling plans and formulated some unsolved mathematical problems of the theory of statistical control
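The notions of operating characteristic and of supplier's and consumer's risks can be sketched for a single sampling plan, assuming a binomial model: a sample of n items is inspected and the lot is accepted if at most c defective items are found. The plan (n = 50, c = 1) and the two defectiveness levels below are hypothetical values chosen only for illustration.

```python
from math import comb

def acceptance_probability(p, n, c):
    """Operating characteristic: P(accept the lot) at defectiveness level p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

n, c = 50, 1                  # hypothetical single sampling plan
acceptance_level = 0.01       # hypothetical acceptance level of defectiveness
rejection_level = 0.10        # hypothetical rejection level of defectiveness

# Supplier's risk: a lot of acceptable quality is rejected.
supplier_risk = 1 - acceptance_probability(acceptance_level, n, c)
# Consumer's risk: a lot of rejectable quality is accepted.
consumer_risk = acceptance_probability(rejection_level, n, c)
```

Evaluating `acceptance_probability` on a grid of p values traces the whole operating characteristic curve of the plan.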
Nonparametric estimates of the probability distribution density in spaces of arbitrary nature are one of the main tools of non-numerical statistics. We consider their particular cases: kernel density estimates in spaces of arbitrary nature, histogram estimates, and Fix-Hodges-type estimates. The purpose of this article is to complete a series of papers devoted to the mathematical study of the asymptotic properties of various types of nonparametric estimates of the probability distribution density in spaces of general nature. Thus, a mathematical foundation is provided for the application of such estimates in non-numerical statistics. We begin by considering the mean square error of the kernel density estimate and the choice of the kernel function and of the sequence of blur indicators (smoothing parameters) that maximizes the order of its decrease. The basic concepts are the circular distribution function and the circular density. The order of convergence in the general case is the same as in estimating the density of a numerical random variable, but the main conditions are imposed not on the density of the random variable but on the circular density. Next, we consider other types of nonparametric density estimates: histogram estimates and Fix-Hodges-type estimates. Then we study nonparametric regression estimates and their application to discriminant analysis problems in a space of general nature
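A kernel density estimate that depends on the data only through a distance function can be sketched as follows. This is a simplified illustration, not the estimator studied in the article: the normalizing factor 1/(n h) used here is correct only for the real line with the distance |a - b|, whereas in a space of general nature the normalization has to be chosen for that space. The Gaussian kernel and all names are assumptions.

```python
import math

def gaussian_kernel(u):
    """Kernel function applied to a scaled distance u >= 0."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kernel_density(x, sample, h, distance=lambda a, b: abs(a - b)):
    """Kernel estimate at x with blur indicator (bandwidth) h.

    Assumption: the factor 1 / (n * h) is valid for the real line only.
    """
    total = sum(gaussian_kernel(distance(x, xi) / h) for xi in sample)
    return total / (len(sample) * h)

sample = [0.0, 1.0, 2.0]
density_at_one = kernel_density(1.0, sample, h=0.5)
```

Passing a different `distance` (for example, a metric on rankings or on sets) shows how the same construction carries over to non-numerical data, once a suitable normalization is supplied.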
We consider an approach to the transition from a continuous to a discrete scale defined by means of a quantization step (i.e., a grouping interval). The applied purpose is to select the number of gradations in sociological questionnaires. In accordance with the methodology of general stability theory, we propose to choose the step so that the errors generated by quantization are of the same order as the errors inherent in the respondents' answers. When the range of the measured value is of finite length, this quantization step uniquely determines the number of gradations. It turns out that for many closed-ended questions it is enough to offer 3-6 answer gradations (prompts). On the basis of a probabilistic model we have proved three theorems on quantization. They allow us to develop recommendations on the choice of the number of gradations in sociological questionnaires. The idea of "quantization" has applications not only in sociology, and it can be used not only to select the number of gradations. Thus, there are two very interesting applications of the idea of "quantization" in inventory management theory: in the two-level model and in the classical Wilson model with deviations from it taken into account (it is shown that "quantization" can be used as a way to improve stability). For the two-level inventory management model we have proved three theorems. We have abandoned the assumption of Poisson demand, which rarely holds in practice, and we give generally fairly simple formulas for finding the optimal values of the control parameters, at the same time correcting the mistakes of our predecessors. Once again we see the interpenetration of statistical methods that arose for the analysis of data from different subject areas, in this case sociology and logistics. This is further evidence that statistical methods form a single scientific and practical area that is inappropriate to divide by areas of application
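The link between the quantization step and the number of gradations can be sketched numerically. The sketch rests on assumptions that are not taken from the article: the rounding error within a step of length h is treated as uniform, so its standard deviation is h / sqrt(12), and "errors of the same order" is read as equality of this value with the standard deviation of the respondent's own error. The scale length and the error level are hypothetical.

```python
import math

def matching_step(respondent_sd):
    """Step h whose rounding error sd, h / sqrt(12), equals respondent_sd."""
    return respondent_sd * math.sqrt(12)

def number_of_gradations(scale_length, step):
    """Number of intervals of the given step needed to cover the scale."""
    return math.ceil(scale_length / step)

scale_length = 10.0    # hypothetical length of the measured range
respondent_sd = 0.6    # hypothetical sd of the respondent's own error
step = matching_step(respondent_sd)
gradations = number_of_gradations(scale_length, step)
```

With these illustrative numbers the step is a little over 2 scale units, so five gradations cover the scale.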
The article presents new results on sample averages in various spaces and laws of large numbers for them. We also introduce weighted averages of type I, corresponding to the sample, and of type II, corresponding to the set of order statistics. The evolution of ideas about the Kemeny distance and the Kemeny median is traced. A modified Kemeny median is proposed that is convenient for computation and avoids the "center of the bagel hole" effect. As a generalization of the Kemeny median, we introduce and study empirical and theoretical averages in spaces of arbitrary nature. For them, we prove laws of large numbers
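The Kemeny median can be illustrated by a brute-force sketch for strict rankings (no ties). Here the distance between two rankings is taken as the number of object pairs they order differently, which agrees with the Kemeny distance up to a constant factor for strict rankings; the median is the ranking that minimizes the sum of distances to the expert rankings. Enumerating all permutations is feasible only for a handful of objects, and the modified median mentioned above is not implemented here.

```python
from itertools import combinations, permutations

def kemeny_distance(r1, r2):
    """Number of object pairs ordered differently by two strict rankings."""
    pos1 = {obj: i for i, obj in enumerate(r1)}
    pos2 = {obj: i for i, obj in enumerate(r2)}
    return sum(
        (pos1[a] < pos1[b]) != (pos2[a] < pos2[b])
        for a, b in combinations(r1, 2)
    )

def kemeny_median(rankings):
    """Strict ranking minimizing the total distance to the given rankings."""
    objects = sorted(rankings[0])
    return min(
        permutations(objects),
        key=lambda candidate: sum(kemeny_distance(candidate, r) for r in rankings),
    )

experts = [("a", "b", "c"), ("a", "b", "c"), ("b", "a", "c")]
median = kemeny_median(experts)
```

Two of the three experts agree on the order a, b, c, and the third differs from them in one pair only, so that order minimizes the total distance.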
In the article we consider the basic ideas of the asymptotic mathematical statistics of interval data, in which the elements of a sample are not numbers but intervals. The algorithms and conclusions of interval data statistics differ fundamentally from the classical ones. We list the results related to the basic concepts of the notna and the rational sample size. Interval data statistics is shown to be an integral part of the system of fuzzy interval mathematics
Classification methods need to be put in order. This will increase their role in solving applied problems, in particular in the diagnosis of materials. To do this, it is first necessary to develop the requirements that classification methods must satisfy. The initial formulation of such requirements is the main content of this work. Mathematical classification methods are considered as part of the methods of applied statistics. We discuss the natural requirements on data analysis methods and on the presentation of calculation results that follow from the achievements and ideas accumulated by the national probabilistic and statistical scientific school. Concrete recommendations are given on a number of issues, and individual errors are criticized. In particular, data analysis methods must be invariant with respect to the permissible transformations of the scales in which the data are measured, i.e., the methods should be adequate in the sense of measurement theory. A specific statistical method of data analysis is always based on some probabilistic model. The model should be clearly described and its premises justified, either from theoretical considerations or experimentally. Data processing methods intended for use in real-world problems should be investigated for stability with respect to permissible deviations of the initial data and of the model premises. The accuracy of the solutions given by the method should be indicated. When publishing the results of a statistical analysis of real data, it is necessary to indicate their accuracy (confidence intervals). As a quality indicator of a classification algorithm, it is recommended to use predictive power rather than the proportion of correct forecasts. Mathematical research methods are divided into "exploratory analysis" and "evidence-based statistics." Specific requirements on data processing methods arise in connection with their "docking" during sequential execution. The article discusses the limits of applicability of probabilistic-statistical methods. Concrete statements of classification problems and typical errors in applying various methods for solving them are also considered
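The recommendation to use predictive power can be illustrated with a short sketch. The abstract does not state the formula, so the form used below is an assumption: with kappa and lambda denoting the shares of correctly classified objects in the first and second class, an estimate d = Phi^{-1}(kappa) + Phi^{-1}(lambda) of the distance between the classes is formed and the predictive power is taken as Phi(d / 2), where Phi is the standard normal distribution function. The function name is hypothetical.

```python
from statistics import NormalDist

def predictive_power(kappa, lam):
    """Assumed form: Phi(d / 2) with d = Phi^{-1}(kappa) + Phi^{-1}(lam)."""
    std_normal = NormalDist()
    d = std_normal.inv_cdf(kappa) + std_normal.inv_cdf(lam)
    return std_normal.cdf(d / 2)

balanced = predictive_power(0.9, 0.9)     # both classes recognized equally well
lopsided = predictive_power(0.99, 0.5)    # second class recognized at chance level
```

In the lopsided case, the overall share of correct answers can look high when the first class dominates the data, while the indicator above does not depend on the class proportions.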
The mathematical theory of classification contains a large number of approaches, models, methods, and algorithms, and is very diverse. We single out three basic results in it: the best method of diagnosis (discriminant analysis), an adequate quality indicator for a discriminant analysis algorithm, and the statement that iterative algorithms of cluster analysis stop after a finite number of steps. Namely, on the basis of the Neyman-Pearson lemma we show that the optimal method of diagnosis exists and can be expressed through the probability densities corresponding to the classes. If the densities are unknown, one should use nonparametric estimates built from training samples. The quality indicator of a diagnostic algorithm is often taken to be "the probability (or share) of correct classification (diagnosis)": the larger this figure, the better the algorithm. We show that the widespread use of this indicator is unjustified and propose another one, "predictive power", obtained by a transformation within the model of linear discriminant analysis. The stopping of iterative cluster analysis algorithms after a finite number of steps is demonstrated using the k-means method as an example. In our opinion, these results are fundamental to the theory of classification, and every specialist who develops or applies it should be familiar with them
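The finite stopping of k-means can be seen in a minimal one-dimensional sketch. The iteration alternates between assigning each point to its nearest center and moving each center to the mean of its points, and it stops as soon as the assignment no longer changes. The data, the initial centers, and the function name are illustrative assumptions.

```python
def k_means_1d(points, initial_centers, max_iter=100):
    """Lloyd-style k-means on the real line; returns (centers, iterations used)."""
    centers = list(initial_centers)
    assignment = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: index of the nearest center for every point.
        new_assignment = [
            min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            for p in points
        ]
        if new_assignment == assignment:  # nothing changed: the algorithm stops
            return centers, iteration
        assignment = new_assignment
        # Update step: move each center to the mean of its points.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = sum(members) / len(members)
    return centers, max_iter

points = [1, 2, 3, 10, 11, 12]
centers, steps = k_means_1d(points, initial_centers=[1, 2])
```

Starting from the deliberately poor centers 1 and 2, the assignment changes once and then stabilizes, with the centers at the means of the two visible groups.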