Novel Approach: Information Quantity for
Calculating the Uncertainty of a Mathematical Model
Extended abstract
B.M. Menin
Almost every reader remembers the idea of our distant ancestors that the Earth is surrounded by a glass dome to which the stars and planets are fastened. Democritus tried to explain to the masses the simple truth that the Earth is just a small particle in a vast universe, but Aristotle's picture was closer to those in power, so it lasted for thousands of years.
Everything comes to an end, and modern science, spanning only about 500 years, has made a real revolution in human consciousness. Today, without genetics, the Big Bang theory, information theory, quantum electrodynamics and the theory of relativity, it is difficult to imagine space flight, genetic engineering, nuclear power plants and, even in theory, an era of relative abundance.
Much of this has become possible thanks to the method of modeling, widely used in recent decades. It is based on accounting for a huge number of variables, the use of powerful computers and modern mathematical methods. That is why the prevailing view in the scientific community is that the more precise the instruments used for model development, the more accurate the results. For example, 3,000 parameters are used in the EnergyPlus program developed by the US Department of Energy. However, energy simulation results can easily range from 50% to 200% of the actual building energy use.
What can be done to overcome this apparent contradiction? Here the theory of similarity comes to the rescue. Applying it is motivated by the desire to generalize the obtained results to different areas of physical application. In studying the phenomena occurring in the world around us, it is advisable to consider not individual variables but their combinations, or complexes, which have a definite physical meaning. The methods of the theory of similarity, based on the analysis of integral-differential equations and boundary conditions, allow these complexes to be identified. In addition, the transition from dimensional physical quantities to dimensionless variables reduces the number of variables taken into account, as the sketch below illustrates. But this alone is not enough to simplify the calculations.
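The abstract itself contains no worked example, but the variable-reduction step can be illustrated by the Buckingham π theorem: for n dimensional variables expressed over a set of primary dimensions, the number of independent dimensionless complexes equals n minus the rank of the dimensional matrix. A minimal sketch, with pipe-flow variables chosen purely for illustration:

```python
import numpy as np

# Dimensional matrix for a classic pipe-flow example (illustrative only).
# Rows: primary dimensions L (length), M (mass), T (time).
# Columns: velocity v, diameter d, density rho, viscosity mu, pressure drop dp.
#              v   d  rho  mu  dp
D = np.array([
    [ 1,  1, -3, -1, -1],   # exponents of L
    [ 0,  0,  1,  1,  1],   # exponents of M
    [-1,  0,  0, -1, -2],   # exponents of T
])

n_variables = D.shape[1]
rank = np.linalg.matrix_rank(D)

# Buckingham pi theorem: number of independent dimensionless complexes.
n_dimensionless = n_variables - rank
print(f"{n_variables} dimensional variables -> {n_dimensionless} dimensionless complexes")
# Output: 5 dimensional variables -> 2 dimensionless complexes
# (e.g., the Reynolds number rho*v*d/mu and the Euler number dp/(rho*v**2))
```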
Human intuition and experience suggest a truth that is simple at first glance. With a small number of variables, the researcher obtains a very rough picture of the process being studied. In turn, a huge number of accounted variables can allow a deep and thorough understanding of the structure of the phenomenon. However, despite this apparent attractiveness, each variable brings its own uncertainty into the integrated (theoretical or experimental) uncertainty of the model or experiment. In addition, the complexity and cost of computer simulations and field tests increase enormously. Therefore, some optimal or rational number of variables, specific to each of the studied processes, must be considered when evaluating a physical-mathematical model.
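This trade-off can be made concrete with a toy model (purely illustrative; it is not the formula proposed by the author): assume the coarseness of an under-specified model falls as A/n with the number n of accounted variables, while every added variable contributes an uncertainty b, so the total behaves as A/n + b·n and attains a minimum at n* = sqrt(A/b):

```python
import math

# Toy trade-off (illustrative only; NOT the formula from the abstract):
# - the first term models the coarseness of an under-specified model,
#   which decreases as more variables are taken into account;
# - the second term models the accumulated uncertainty that every
#   additional variable brings into the model.
A = 50.0   # hypothetical "coarseness" scale
b = 0.5    # hypothetical per-variable uncertainty contribution

def total_uncertainty(n: int) -> float:
    return A / n + b * n

best_n = min(range(1, 101), key=total_uncertainty)
print(f"optimal number of variables: {best_n}, "
      f"uncertainty: {total_uncertainty(best_n):.2f}")
# The analytic minimum of A/n + b*n lies at n* = sqrt(A/b) = 10.
```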
In this situation, the theory of information comes to the aid of scientists, because simulation is an information process: the developed model receives information about the state and behavior of the observed object. This information is the main subject of interest in the theory of modeling.
A model is a framework of ideas and concepts through which a researcher/observer interprets his intuition, experience, observations and experimental results. It includes a physical structure-model and a mathematical structure-model. The physical model is expressed as the set of natural laws inherent to the recognized object; it interprets the mathematical model, including its assumptions and constraints. The mathematical model is a set of equations using symbolic representations of quantitative variables in a simplified physical system; it helps the modeler understand and quantify the physical model, thus enabling the physical-mathematical model to make precise predictions and serve different applications.
The process of formulating a physical-mathematical model can be called information processing: the initial information and/or representations of the object under study do not change, but new information is created. Physicists and engineers receive information about the observed process and can develop scientific laws and analyze natural phenomena or technical processes only on the basis of this information.
In other words, the observer knows about a particular phenomenon only if this object has a name in the observer's mind and his mind holds data that represent the properties of the object. It should be emphasized that no observer or group of scientists is ideal, because otherwise they would be able, potentially, to acquire endless knowledge.
Thus, scientists came to the brilliant idea of quantifying the uncertainty of a conceptual model by the amount of information embedded in the model, conditioned only by the selection of a limited number of variables that must be taken into account. This idea is based on thermodynamic theory and on concepts of Mark Burgin's general theory of information. It rests on two axioms.
The first is that general knowledge of the world is significantly limited by the act of choosing a system of primary variables (SPV). Whatever people know, all scientific knowledge depends only on information framed by the SPV. Examples of an SPV are SI (the International System of Units) or CGS (centimeter-gram-second). The number of dimensional variables included in an SPV is finite. An SPV is a set of variables (primary and, designed on their basis, secondary) that are necessary and sufficient to describe the known laws of nature, both qualitatively (in their physical content) and quantitatively.
The second is that the number of variables considered in the physical-mathematical model is limited. The limits of the description of the process are determined by the choice of the class of phenomena (CoP) and the number of secondary parameters considered in the mathematical model. A CoP is a collection of physical phenomena and processes described by a finite number of primary and secondary variables that characterize certain features of the observed phenomenon from the qualitative and quantitative aspects.
For example, for combined processes of heat exchange and electromagnetism, it is useful to use the primary SI dimensions length L, mass M, time T, temperature Θ and electric current I, i.e., CoP_SI ≡ LMTΘI. In thermodynamics, the basic set of primary dimensions often includes L, M, T and the thermodynamic temperature Θ, that is, CoP_SI ≡ LMTΘ. If the SPV and CoP are not specified, the notion of "information about the phenomenon being investigated" loses its validity, and the information quantity can increase to infinity or decrease to zero. Without an SPV, simulation of the process is impossible. As the famous French physicist Brillouin noted, "You cannot get anything out of nothing, even observation." The SPV can be interpreted as the basis of all the knowledge that people can have about surrounding nature at the moment.
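The finiteness imposed by the SPV and CoP can be made tangible with a counting sketch. If every secondary variable is a product of powers of the primary dimensions, and each exponent is restricted to a finite integer range, the total number of distinct dimensional combinations is simply the product of the range sizes. The ranges below are hypothetical, chosen only to show the mechanics of such a count; they are not the values used in the author's calculations:

```python
from math import prod

# Hypothetical integer ranges for the exponents of each primary dimension
# in a CoP over L, M, T, Theta, I (the ranges are illustrative assumptions).
exponent_ranges = {
    "L":     range(-3, 4),   # length exponent from -3 to +3
    "M":     range(-1, 2),   # mass exponent from -1 to +1
    "T":     range(-4, 5),   # time exponent from -4 to +4
    "Theta": range(-1, 2),   # temperature exponent from -1 to +1
    "I":     range(-2, 3),   # electric current exponent from -2 to +2
}

# Every variable corresponds to one tuple of exponents, so the number of
# possible dimensional combinations is the product of the range sizes;
# subtracting 1 removes the all-zero tuple (the dimensionless case).
total = prod(len(r) for r in exponent_ranges.values())
print(f"possible dimensional combinations: {total - 1}")
# 7 * 3 * 9 * 3 * 5 - 1 = 2834 -- finite, however the ranges are chosen.
```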
To this we should add that the researcher chooses variables for the model describing the observed object based on his experience, knowledge and intuition. These variables can differ fundamentally in nature, qualitatively and quantitatively, from the group of variables selected by another team of scientists. This happened, for example, in studies of the motion of the electron as a particle or as a wave. Therefore, the choice of a variable can be considered a random process, with the inclusion of each particular variable being equally probable. This approach completely ignores the human evaluation of information: a set of 100 notes played by a chimpanzee and a melody of 100 notes from the Andante movement of Mozart's Piano Concerto No. 21 contain exactly the same amount of information, as the sketch below shows.
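Under this equiprobability assumption, the amount of information in a selection depends only on how many outcomes were possible, not on what they mean: a sequence of n symbols drawn independently from an alphabet of k equiprobable symbols carries H = n·log2(k) bits in either case. A minimal check (the 88-key piano alphabet is an assumption made for illustration):

```python
import math

# Shannon information of a sequence of equiprobable, independent choices:
# H = n * log2(k) bits for n symbols drawn from an alphabet of size k.
# The 88-key piano alphabet is an illustrative assumption.
def information_bits(n_symbols: int, alphabet_size: int) -> float:
    return n_symbols * math.log2(alphabet_size)

chimp_sequence = information_bits(100, 88)   # 100 random key presses
mozart_melody  = information_bits(100, 88)   # 100 notes of the Andante

print(f"chimpanzee: {chimp_sequence:.1f} bits, Mozart: {mozart_melody:.1f} bits")
# Both evaluate to 100 * log2(88) ~ 645.9 bits: the measure is blind
# to the human (semantic) value of the information.
```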
Perhaps surprisingly, on the basis of the above assumptions one can obtain a mathematically very simple formula for calculating the uncertainty of a mathematical model describing the observed phenomenon. This uncertainty sets a limit on the advisability of increasing measurement accuracy in pilot studies or computer simulations. It is not a purely mathematical abstraction; it has a physical meaning. The relationship testifies that in nature there is a fundamental limit to the accuracy of measuring any observed material object, a limit that cannot be surpassed by any improvement of instruments or methods of measurement. The value of this limit is much larger than the one provided by the Heisenberg uncertainty relation, which places its severe restrictions only on micro-physics.
The proposed method was used to analyze measurements of the Boltzmann constant and the gravitational constant published in the scientific literature during 2005–2016. The results are in excellent agreement with the recommendations of CODATA (the Committee on Data for Science and Technology).