Estimators of the probability density function in spaces of arbitrary nature are used for various tasks in the statistics of non-numerical data. A systematic exposition of the theory of such estimators was begun in our work [2], and this article is its direct continuation. We regularly refer to the conditions and theorems of [2], in which we introduced several types of nonparametric estimators of the probability density and studied linear estimators in more detail. In this article we consider a particular case: kernel density estimators in spaces of arbitrary nature. When the density of a one-dimensional random variable is estimated, kernel estimators reduce to the Parzen-Rosenblatt estimators. Theorems 1-8 are devoted to the asymptotic behavior of kernel density estimators in the general case of a space of arbitrary nature. Under various conditions we prove the consistency and asymptotic normality of kernel density estimators and study their uniform convergence. We introduce the concept of "preferred rate differences" and study kernel density estimators based on it. We also introduce and study natural affinity measures, which are used in the analysis of the asymptotic behavior of kernel density estimators. We find the asymptotic behavior of the variances of kernel density estimators and consider examples, including kernel density estimators in finite-dimensional spaces and in the space of square-integrable functions.
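For orientation, the one-dimensional special case mentioned above (the Parzen-Rosenblatt estimator) can be sketched as follows. This is a minimal illustration, not the general construction for spaces of arbitrary nature: the Gaussian kernel, the function names, the bandwidth value, and the sample are all assumptions made for the example.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_rosenblatt(sample, x, h):
    """Kernel density estimate at point x with bandwidth h > 0.

    f_n(x) = (1 / (n * h)) * sum_i K((x - x_i) / h)
    """
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [-1.0, 0.5, 2.0]
    for x in (-1.0, 0.0, 1.0, 2.0):
        print(x, parzen_rosenblatt(sample, x, h=0.7))
```

In the one-dimensional case the kernel is applied to the scaled difference (x - x_i) / h; in a general space a measure of difference between x and x_i would take the place of that difference, with a suitable normalization.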
The article is devoted to nonparametric point and
interval estimation of the characteristics of a
probability distribution (the expectation, median,
variance, standard deviation, and coefficient of
variation) from sample data. Sample values are
regarded as realizations of independent and
identically distributed random variables with an
arbitrary distribution function that has the required
number of moments. Nonparametric procedures are
compared with parametric procedures based on the
assumption that the sample values are normally
distributed. Point estimators are constructed in the
obvious way, as sample analogs of the theoretical
characteristics. Interval estimators are based on the
asymptotic normality of sample moments and of
functions of them. Nonparametric asymptotic
confidence intervals are obtained by a special
technique for deriving asymptotic relations in
applied statistics. In the first step, this technique
applies the multidimensional central limit theorem
to sums of vectors whose coordinates are powers of
the initial random variables. The second step
transforms the limiting multivariate normal vector
into the vector of interest to the researcher; here we
use linearization and discard infinitesimal
quantities. The third step is a rigorous justification
of the results at the level of rigor that is standard
for asymptotic reasoning in mathematical statistics.
This usually requires the necessary and sufficient
conditions for the inheritance of convergence. The
article contains 10 numerical examples. The initial
data are the operating times of 50 cutting tools
until they reach the limit state. Methods developed
under the assumption of a normal distribution can
lead to noticeably distorted conclusions when the
normality hypothesis does not hold. The practical
recommendation is that nonparametric confidence
limits should be used for the analysis of real data.
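The interval estimators described above can be illustrated with a minimal sketch for two of the characteristics, the expectation and the variance. It relies only on the standard asymptotic normality of the sample mean and the sample variance; the function names and the toy data are assumptions of this example, and the data are not the cutting-tool data of the article.

```python
import math
from statistics import NormalDist


def _central_moment(sample, mean, k):
    """k-th sample central moment (divisor n)."""
    return sum((x - mean) ** k for x in sample) / len(sample)


def mean_ci(sample, confidence=0.95):
    """Asymptotic nonparametric interval: mean +/- z * s / sqrt(n)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = _central_moment(sample, mean, 2)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half = z * math.sqrt(m2 / n)
    return mean - half, mean + half


def variance_ci(sample, confidence=0.95):
    """Asymptotic nonparametric interval for the variance.

    Uses the fact that the sample variance is asymptotically normal with
    variance (mu_4 - sigma^4) / n, estimated by sample central moments.
    """
    n = len(sample)
    mean = sum(sample) / n
    m2 = _central_moment(sample, mean, 2)
    m4 = _central_moment(sample, mean, 4)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half = z * math.sqrt(max(m4 - m2 * m2, 0.0) / n)
    return m2 - half, m2 + half


if __name__ == "__main__":
    toy = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up values, illustration only
    print(mean_ci(toy))
    print(variance_ci(toy))
```

Unlike the normal-theory interval for the variance, the second function uses the fourth sample moment, so it does not assume any particular form of the distribution. The approximation is asymptotic, and a sample as small as the toy one serves only to show the arithmetic.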
In statistical hypothesis testing, critical values
are usually given for a priori fixed (nominal)
significance levels. Researchers typically use one of
the three values 0.01, 0.05, 0.1, to which a few
others may be added: 0.001, 0.005, 0.02, and so on.
However, for statistics with discrete distribution
functions, which include, in particular, all
nonparametric statistical tests, the real significance
levels may differ from the nominal ones, sometimes
severalfold. By the real significance level we mean
the highest attainable significance level of a
discrete statistic that does not exceed the given
nominal significance level (i.e., moving to the next
attainable value of the discrete statistic gives a
significance level greater than the nominal one). In
the article, we discuss the difference between
nominal and real significance levels using
nonparametric tests for the homogeneity of two
independent samples as an example. We study the
two-sample Wilcoxon test, the van der Waerden test,
the two-sample two-sided Smirnov test, the sign test,
and the Wolfowitz runs test, and calculate the real
significance levels of these tests for the nominal
significance level of 0.05. The power of these
statistical tests is studied by the Monte Carlo
method. The main conclusion is that using nominal
significance levels instead of real significance
levels for discrete statistics is inadmissible for
small sample sizes.
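The definition of the real significance level can be made concrete with a minimal sketch for the two-sided sign test. It assumes that under the null hypothesis the test statistic has a Binomial(n, 1/2) distribution and that the rejection region is symmetric; the function name is hypothetical.

```python
from math import comb


def real_significance_level(n, nominal=0.05):
    """Largest attainable level of the two-sided sign test not exceeding nominal.

    The rejection region is {B <= k} or {B >= n - k}, where B ~ Binomial(n, 1/2)
    under the null hypothesis. Returns 0.0 if no such region fits under nominal.
    """
    best = 0.0
    for k in range(n // 2 + 1):
        if k >= n - k:  # the two tails would overlap
            break
        level = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
        if level > nominal:
            break
        best = level
    return best


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, real_significance_level(n))
```

For n = 10 the attainable levels are 2/1024, 22/1024, 112/1024, and so on, so the real level for the nominal 0.05 is 22/1024, about 0.0215, which is less than half the nominal value. For n = 5 even the smallest rejection region has level 2/32 = 0.0625, so no rejection at the nominal level 0.05 is possible.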
This article continues the works [1,2], which studied the hydrodynamics and transport of salt ions in an experimental electrochemical cell with a rotating disk cation-exchange membrane under underlimiting current regimes, when the condition of local electroneutrality holds. Here we present a mathematical model of salt ion transport in a cell with a rotating disk cation-exchange membrane under overlimiting current regimes, taking electroconvection into account. Under these conditions the fluid dynamics depends on the transport of the salt ions and is described by the system of Navier-Stokes equations in a cylindrical coordinate system with electric body forces.
Without science it would be impossible to form a full environmental consciousness. To make findings on the impact of the environment on quality of life more valid and more convincing, it is necessary to quantify the strength and direction of the influence of diverse environmental factors. This turns out to be quite problematic for a number of reasons. First, the source data needed for such research are lacking or inaccessible. The data that can be found cover only short observation periods (short longitudinal series), and supplementing them, including by experiment, is fundamentally impossible. As a result, one cannot require full replication of the data, which is a necessary condition for the correct application of factor analysis. Second, environmental factors are described by heterogeneous indices measured on different types of scales (nominal, ordinal, and numerical) and in different units. Mathematical methods for the comparable processing of such data, and software tools that implement them, generally do not exist. Third, these are large-scale problems: they involve not 5 or at most 7 factors, as in factor analysis, but hundreds or thousands. Fourth, the original data are noisy and require robust methods. Fifth, environmental factors are interrelated and require nonlinear nonparametric approaches. To solve these problems we propose to apply a new intelligent technology, automated system-cognitive analysis, and its software tool, a system called "Eidos". We also give a brief numerical example of assessing the impact of environmental factors on life expectancy and causes of death.
We propose a method for testing the independence of two alternative (dichotomous) variables on the basis of the statistics of non-numerical data. The method is intended for problems of statistical quality
control. The test is based on a set of small samples, i.e., it works in Kolmogorov asymptotics, where the number of unknown
parameters of the distribution grows in proportion to the amount of data.
We propose a mathematical model of ion transport of a binary salt in an electroosmotic flow in a capillary. The capillary is open on one side and immersed in a vessel of large volume, in which the concentration of the solution is kept constant; the other side is closed by an ion-exchange membrane. The walls are considered wettable, i.e., the solution adheres to them, so the model uses the no-slip condition for the velocity. We study the boundary value problem for the coupled system of Nernst-Planck-Poisson and Navier-Stokes equations with boundary conditions of general form. The mathematical model is based on the general laws of transport and contains no adjustable parameters. Using this model, we establish the basic regularities of salt ion transport, solution flow, the emergence and development of electroconvection, and the distribution of salt ion concentration in the capillary at small times, i.e., in the initial (transient) regime. We identify electroconvective vortices near the surface of the ion-exchange membrane and their influence on the mechanisms of salt ion transport and fluid motion in different regions of the capillary. A distinctive feature of transport in the capillary is the presence, to the right of the vortex region, of stagnant zones with an elevated ion concentration.
The article proposes a graph model for managing the assessment of students' knowledge under fuzzy conditions. The model makes it possible to determine the optimal number and optimal placement of assessment actions over a course of study for each discipline
(or each of its fragments), and also to assess the structure of knowledge.
It is shown that the metric of a galaxy should be universal, depending only on the fundamental constants. We give examples of universal metrics obtained in Einstein's theory of gravitation and in Yang-Mills theory. The axially symmetric solutions of Einstein's vacuum equations are applied to explain the rotation of matter in spiral galaxies.
In various applications it is necessary to analyze
several expert orderings, i.e., clustered rankings of
the objects under examination. Such applications
include technical studies, ecology, management,
economics, sociology, forecasting, etc. The objects
may be product samples, technologies, mathematical
models, projects, job applicants, and others.
Clustered rankings can be obtained both from experts
and objectively, for example, by comparing
mathematical models with experimental data using a
particular quality criterion. The method described in
this article was developed in connection with
problems of chemical safety and environmental
security of the biosphere. We propose a new method
for constructing a clustered ranking that can be
regarded as average (in the sense discussed in this
work) for all the clustered rankings under
consideration. The contradictions between the
individual initial rankings are then contained within
the clusters of the average (coordinated) ranking.
As a result, the ordering of the clusters reflects
the common opinion of the experts, more precisely,
what is shared by all the original rankings. The
newly built clustered ranking is often called the
coordinated ranking with respect to the original
clustered rankings. Its clusters enclose the objects
about which some of the initial rankings contradict
each other. New studies are needed for these
objects. These studies may be formally mathematical
(calculation of the Kemeny median, orderings by
means of the averages and medians of ranks, etc.), or
they may require new information from the relevant
application area, in which case additional scientific
research may be necessary. In this article we
introduce the necessary concepts, formulate in
general terms the new algorithm for constructing the
coordinated ranking of a set of clustered rankings,
and discuss its properties.
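The idea of keeping all contradictions inside clusters can be illustrated with a minimal sketch. It is not the algorithm of the article, whose details are not given in this abstract. The sketch assumes that two objects are contradictory when one ranking places the first strictly before the second and another ranking places it strictly after, that clusters are the connected components of the resulting contradiction graph, and that clusters are ordered by mean position. The function names and the input format (a ranking as a list of clusters) are hypothetical.

```python
from itertools import combinations


def positions(ranking):
    """Map each object to the index of its cluster in one clustered ranking."""
    return {obj: idx for idx, cluster in enumerate(ranking) for obj in cluster}


def coordinated_ranking(rankings):
    """Merge objects that some pair of rankings orders in opposite ways."""
    pos = [positions(r) for r in rankings]
    objects = sorted(pos[0])
    parent = {o: o for o in objects}  # union-find forest

    def find(o):
        while parent[o] != o:
            parent[o] = parent[parent[o]]  # path halving
            o = parent[o]
        return o

    for a, b in combinations(objects, 2):
        # -1 if a is strictly before b, +1 if strictly after, 0 if tied
        signs = {(p[a] > p[b]) - (p[a] < p[b]) for p in pos}
        if 1 in signs and -1 in signs:  # strict disagreement: contradiction
            parent[find(a)] = find(b)

    clusters = {}
    for o in objects:
        clusters.setdefault(find(o), []).append(o)

    def mean_position(cluster):
        total = sum(p[o] for p in pos for o in cluster)
        return total / (len(pos) * len(cluster))

    # Ordering clusters by mean position is an assumption of this sketch.
    return sorted(clusters.values(), key=mean_position)


if __name__ == "__main__":
    expert_a = [[1], [2], [3], [4]]  # made-up rankings, illustration only
    expert_b = [[2], [1], [3], [4]]
    print(coordinated_ranking([expert_a, expert_b]))
```

In the toy input the two experts disagree only about objects 1 and 2, so these two objects end up in one cluster, while objects 3 and 4 keep their agreed order. A tie in one ranking combined with a strict order in another is not treated as a contradiction here.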