• Spatial and dynamic models. Classification of types of modeling. Dynamic models. Examples of building dynamic models. Methods of interpolation by area

    3D cartographic images are electronic maps of a higher level and represent spatial images of the main elements and objects of the area visualized using computer modeling systems. They are intended for use in control and navigation systems (ground and air) for terrain analysis, solving calculation problems and modeling, designing engineering structures, and environmental monitoring.

    Terrain simulation technology makes it possible to create visual and measurable perspective images that closely resemble the real terrain. Including such images in a computer film according to a certain scenario allows the viewer to “see” the terrain from different viewpoints, under different lighting conditions and for different seasons and times of day (static model), or to “fly” over it along given or arbitrary trajectories at a chosen flight speed (dynamic model).

    The use of computer tools that include vector or raster displays capable of converting input digital information into a given frame in their buffer devices requires the preliminary creation of digital spatial terrain models (hereinafter PMM) as that input information.

    Digital PMM in essence represent a set of digital semantic, syntactic and structural data recorded on computer media, intended for reproduction (visualization) of three-dimensional images of terrain and topographic objects in accordance with specified conditions of observation (review) of the earth's surface.

    Initial data for creating digital PMMs may include photographs, cartographic materials, topographic and digital maps, city plans and reference information that provide data on the position, shape, size, color, and purpose of objects. In this case, the completeness of the PMM will be determined by the information content of the photographs used, and the accuracy - by the accuracy of the original cartographic materials.

    Technical means and methods for creating PMM

    Development of technical means and methods for creating digital PMMs is a difficult scientific and technical problem. The solution to this problem involves:

    Development of hardware and software tools for obtaining primary three-dimensional digital information about terrain objects from photographs and map materials;
    - creation of a system of three-dimensional cartographic symbols;
    - development of methods for generating digital PMMs using primary cartographic digital information and photographs;
    - development of an expert system for forming the content of the PMM;
    - development of methods for organizing digital data in the PMM bank and principles for constructing the PMM bank.



    The development of hardware and software for obtaining primary three-dimensional digital information about terrain objects from photographs and map materials is shaped by the following fundamental features:

    Higher requirements for the completeness and accuracy of digital PMMs compared with those imposed on traditional digital terrain models;
    - the use, as source material for interpretation, of photographs obtained by frame, panoramic, slit and CCD imaging systems that were not designed to provide accurate measurement information about terrain objects.

    Creation of a system of three-dimensional cartographic symbols is a fundamentally new task of modern digital cartography. Its essence is to create a library of symbols that are close to the real image of terrain objects.

    Methods for generating digital PMMs using primary digital cartographic information and photographs must ensure, on the one hand, the efficiency of their visualization in the buffer devices of computer systems, and, on the other hand, the required completeness, accuracy and clarity of the three-dimensional image.

    Research currently being carried out has shown that, depending on the composition of the source data, digital PMMs can be obtained by methods using:

    Digital cartographic information;
    - digital cartographic information and photographs;
    - photographs.

    The most promising appear to be methods that use digital cartographic information together with photographs. The main ones may be methods for creating digital PMMs of varying completeness and accuracy: from photographs and DEMs; from photographs and digital cartographic materials; from photographs and DTMs.

    The development of an expert system for forming the content of the PMM should solve the problems of designing spatial images: selecting the object composition, generalizing and symbolizing it, and rendering the image in the required map projection. In this case, it will be necessary to develop a methodology for describing not only conventional signs but also the spatial-logical relationships between them.

    The solution to the problem of developing methods for organizing digital data in a PMM bank and the principles of constructing a PMM bank is determined by the specifics of spatial images and data presentation formats. It is quite possible that it will be necessary to create a space-time bank with four-dimensional simulations (X, Y, H, t), where PMMs will be generated in real time.

    Hardware and software tools for displaying and analyzing PMM

    The second problem is the development of hardware and software for displaying and analyzing digital PMMs. The solution to this problem involves:

    Development of technical means for displaying and analyzing PMM;
    - development of methods for solving calculation problems.

    The development of hardware and software for displaying and analyzing digital PMMs will require the use of existing graphics workstations, for which special software (SPO) must be created.

    Development of methods for solving calculation problems is an applied problem that arises in the process of using digital PMMs for practical purposes. The composition and content of these tasks will be determined by specific PMM consumers.


    UDC 519.673: 004.9

    INTERPRETATION OF THE CONCEPTUAL MODEL OF A SPATIAL DYNAMIC OBJECT IN THE CLASS OF FORMAL SYSTEMS*

    A.Ya. Friedman

    Institute of Informatics and Mathematical Modeling KSC RAS

    Abstract

    The issues of modeling complex dynamic objects (SDO) in weakly formalized subject areas are considered. For the previously proposed situational conceptual model of such objects, an interpretation has been developed in the class of semiotic formal systems. It makes it possible to integrate various means of studying an SDO, providing joint logical and analytical data processing and situational analysis of the state of the object under study using expert knowledge and taking into account spatio-temporal dependencies in the characteristics of the SDO, which is performed using cartographic information.

    Key words:

    conceptual model, spatial dynamic object, semiotic formal system.

    Introduction

    This paper examines the issues of modeling SDOs in weakly formalized subject areas. In addition to structural complexity, a peculiarity of SDOs is that the results of their functioning depend significantly on the spatial characteristics of their component parts and on time.

    When modeling an SDO, it is necessary to take into account a variety of information, financial, material and energy flows, to provide for an analysis of the consequences of changing the structure of the object, of possible critical situations, etc. The fundamental incompleteness of knowledge about such objects limits the applicability of classical analytical models and dictates a focus on using the experience of experts, which, in turn, requires creating appropriate means of formalizing expert knowledge and integrating them into the modeling system. Therefore, in modern modeling the role of such a concept as the conceptual model of the problem domain (KMPO) has grown significantly. The basis of the KMPO is not an algorithmic model of data transfer and transformation, as in analytical models, but a declarative description of the structure of an object and the interaction of its component parts. Thus, the KMPO is initially oriented toward formalizing the knowledge of experts. In the KMPO, the elements of the subject area under study are defined and the relationships between them are described, which specify the structure and the cause-and-effect relationships that are significant within the framework of a particular study.

    The situational modeling system (SMS) presented in this work, based on a tree-like situational conceptual model (SCM), is one of the options for implementing technologies such as CASE (Computer Aided Software Engineering) and RAD (Rapid Application Development).

    * The work was partially supported by grants from the Russian Foundation for Basic Research (projects No. 13-07-00318-a, No. 14-07-00256-a, No. 14-07-00257-a, No. 14-07-00205-a, No. 15-07-04760-a, No. 15-07-02757-a).

    Semiotic formal systems

    The main advantage of logical calculi as a model for representing and processing knowledge is the existence of a uniform formal procedure for proving theorems. However, it also entails the main disadvantage of this approach: the difficulty of using, during a proof, heuristics that reflect the specifics of a particular problem environment. This is especially important when building expert systems, whose computing power is mainly determined by knowledge characterizing the specifics of the subject area. Other disadvantages of formal systems include their monotonicity (the inability to retract conclusions when an additional fact becomes true, which distinguishes them from common-sense reasoning), the lack of means for structuring the elements used, and the inadmissibility of contradictions.

    The desire to eliminate these shortcomings of formal systems when they are used in artificial intelligence led to the emergence of semiotic systems, formalized as an eight-tuple:

    S::= (B, F, A, R, Q(B), Q(F), Q(A), Q(R)). (1)

    In (1), the first four components are the same as in the definition of a formal system, and the remaining components are the rules for changing the first four components under the influence of the experience accumulated in the knowledge base about the structure and functioning of entities in a given problem environment. The theory of such systems is at an early stage of development, but there are many examples of solving specific problems within the framework of this paradigm. One such example is described below.
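    For illustration only (the paper gives no implementation, and every name below is hypothetical), the eight-tuple (1) can be held in a simple container whose first four components are the usual formal-system parts and whose last four are the rules that rewrite them as experience accumulates:

```python
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class SemioticSystem:
    """Hypothetical container for the semiotic system S of (1)."""
    B: Set[str]        # basic symbols (alphabet)
    F: Set[str]        # well-formed formulas / syntactic rules
    A: Set[str]        # axioms
    R: Set[Callable]   # inference rules
    Q_B: Callable      # Q(B): rule that modifies the alphabet
    Q_F: Callable      # Q(F): rule that modifies the formulas
    Q_A: Callable      # Q(A): rule that modifies the axioms
    Q_R: Callable      # Q(R): rule that modifies the inference rules

    def adapt(self, experience: object) -> None:
        """Let accumulated experience change all four base components."""
        self.B = self.Q_B(self, experience)
        self.F = self.Q_F(self, experience)
        self.A = self.Q_A(self, experience)
        self.R = self.Q_R(self, experience)
```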

    Basics of situational modeling

    When setting a problem and preparing the modeling process, the KMPO is intended to represent knowledge about the structure of the subject area under study. For KMPO elements there is a correspondence between a real-world object and its model representation. To make it possible to automate the subsequent stages of modeling, the model of the subject area is mapped onto a formal system adequate to it. This transition is realized during the construction of the KMPO by assigning each of its elements a certain formal description. As a result, the completion of the construction of the KMPO corresponds to the transition from informal knowledge about the subject area under study to its formal representation, which allows only an unambiguous procedural interpretation. The resulting formal model is declarative in nature, since it primarily describes the composition, structure and relationships between objects and processes, regardless of the specific method of their implementation in the computer.

    The declarative language for describing SCM consists of two parts: a part corresponding to the objects of the described world, and a part corresponding to the relationships and attributes of the objects presented in the model. Axiomatic set theory is used as the mathematical basis of the declarative language.

    The SCM describes three types of elements (entities) of the real world: objects, processes and data (or resources). Objects reflect the organizational and spatial structure of the research object; a set of processes can be associated with each of them. A process is understood as some action (procedure) that transforms a subset of data, called input with respect to the process under consideration, into another subset, called output. The data characterize the state of the system; they are used in the implementation of processes and serve as the results of their execution. The execution of any process changes data and corresponds to a transition of the system from one state to another. The relationships and interactions of real-world objects are described in the model using relations defined on the sets of objects, processes and data. Each relation connects a model element with some set of other elements.

    The names of SCM elements are given in terms of the subject area. Each element of the model is assigned an executor that ensures its implementation during the simulation. The executor type determines implementation characteristics such as the programming language in which the executor of the corresponding process is written and the representation of the executor in that algorithmic language.

    Attributes describing the type of hierarchy relationship specify the representation of model objects at the next, lower level of the hierarchy. The composition (&) relation type specifies that an object is constructed by an aggregation of its subobjects. The classification type (v) indicates that a top-level object is a generalization of a group of lower-level objects. The “classification” type relationship in SCM is used to represent different variants of a top-level element. The “iteration” type (*) allows you to define iterative processes in SCM and describe regular data structures.

    Depending on the type of hierarchy relation, an object is assigned control data. Control data are used to further define the structure of processes that have the “classification” or “iteration” hierarchy relation type, and of data that have the “iteration” hierarchy relation type.
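    A minimal sketch (all Python names are hypothetical, not taken from the SSM) of how the three hierarchy relation types and the attached control data could be encoded:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

class HierarchyType(Enum):
    COMPOSITION = "&"     # the object aggregates its subobjects
    CLASSIFICATION = "v"  # the object generalizes alternative variants
    ITERATION = "*"       # iterative processes / regular data structures

@dataclass
class ModelElement:
    name: str
    hierarchy: HierarchyType
    children: List["ModelElement"] = field(default_factory=list)
    # control data further define CLASSIFICATION and ITERATION elements,
    # e.g. which variant is selected or how many iterations are performed
    control_data: Optional[str] = None
```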

    The formal representation of SCM makes it possible to significantly automate the analysis of the correctness of the structure and solvability of SCM.

    An important aspect of the effectiveness of the SCM is the convenience of presenting simulation results. Currently, the most promising environment for computerized research of SDO-class objects is considered to be a geographic information system (GIS). In addition to advanced visualization and graphical data processing, GIS tools in principle allow tasks for spatially coordinated calculations to be formulated in a user-friendly graphical environment, although this requires additional software development. In addition, GIS packages are not designed for analyzing the dynamics of an object or for serious mathematical processing of data.

    Another advantage of GIS within the framework of the problem under consideration is that each graphic element can be associated with additional database fields that can be modified by external computing modules, in contrast to graphic attributes. In particular, these fields can store the attributes of the conceptual model related to a given element, and other parameters necessary for organizing and conducting modeling.

    Thus, each cycle of calculations during modeling includes three stages: setting the calculation conditions, the calculation itself, and output of the results. The informal goal of SCM development is to automate all these stages while providing maximum service to the non-programming user, that is, using domain terminology and a friendly user interface with the computer. For the same reasons, the SMS must be functionally complete, that is, provide the user with all the tools he needs without explicitly accessing other software environments. Creating specialized graphic libraries and report generation tools would require unreasonable programming costs and significantly lengthen development time. Therefore, a compromise solution seems appropriate: assign data output tasks to standard packages or specialized software modules, but automate their work to the maximum extent, eliminating dialogue with the user in their environment.


    Formal description of SCM

    The SCM is based on representing the modeling object in the form of a tree-like AND-OR graph that displays the hierarchical decomposition of the structural elements of the SDO in accordance with their organizational connections.

    To avoid computational problems associated with small changes in data and to support joint computational and logical data processing, in the SCM the output data of processing procedures (with the exception of data calculated by the GIS) can only be data with a discrete finite set of values (such as lists). If the values of a datum are string constants, such a datum is called a parameter (category PAR); one that has numerical values is called a variable (category VAR), and certain mathematical operations can be performed on it. If the result of a calculation is the value of a variable, it is rounded to the nearest value in the list of valid values. Hereinafter, when a statement applies to data of any type allowed in the SCM, the term “datum” is used. Thus, the set of data names is divided into the sets of variable and parameter names:

    D ::= < Var, Par >,  Var ::= {var_i}, i = 1, ..., N_v;  Par ::= {par_j}, j = 1, ..., N_p,   (2)

    where N_v and N_p are the cardinalities of these sets.

    Data model the resources (quantitative characteristics) of objects or processes (category RES); variables can also be used as tuning parameters of the functions (criteria) describing the quality of functioning of SCM elements (category ADJ). Accordingly, the set of variable names is divided into a subset of names of resources of SCM elements and a subset of names of tuning parameters of the quality criteria of these elements:

    Var::=< Res, Adj > (3)

    A separate category (GIS category) consists of graphical characteristics of SCM objects, directly calculated in GIS. All of them belong to variables, but are not considered as lists, since they are used only as input resources of model elements and do not change during the simulation.
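    The requirement that every computed variable (category VAR) be snapped to the nearest element of its finite list of admissible values can be sketched as follows (a sketch under the stated assumptions; the function name is hypothetical):

```python
from typing import Sequence

def round_to_allowed(value: float, allowed: Sequence[float]) -> float:
    """Snap a computed variable value to the nearest admissible value
    from its finite list, as required for SCM output data."""
    if not allowed:
        raise ValueError("the list of admissible values must not be empty")
    return min(allowed, key=lambda v: abs(v - value))

# example: a variable restricted to the list [0, 5, 10, 20]
print(round_to_allowed(7.3, [0, 5, 10, 20]))  # -> 5
```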

    SCM objects have three main characteristics: a name; a functional type, which defines the structure and functions of the object and is used when analyzing the correctness of the SCM; and the name of the superobject that dominates this object in the SCM (absent for the top-level object). According to their position in the object tree and on the map, three categories of SCM objects are distinguished: primitives (category LEAF), structurally indivisible from the point of view of the global modeling goal; elementary objects (category GISC), geographically associated with a single GIS element (a polygon, arc or point of some coverage); and composite objects (category COM), consisting of elementary and/or composite objects. The structure of objects of the GISC category in the SCM can be quite complex, but all of their subobjects have the same geographic location. The set of objects forms a hierarchy:

    O = {o_a^(b,g)} ::= ∪_a O_a,   (4)

    where a = 1, ..., L is the number of the level of the object tree to which the object belongs (L is the total number of decomposition levels);

    b = 1, ..., N_a is the serial number of the object at its decomposition level;

    g = 1, ..., N_(a-1) is the serial number of the superobject that dominates the given element at the level above;

    O_a is the set of objects belonging to level number a.


    To ensure the coherence of the SCM, it is assumed that there is a single superobject that dominates all objects of the first level of decomposition, that is, the following relation is valid:

    O_0 ::= {o_0},  N_0 = 1.   (5)

    Processes in the SCM represent data transformations and are implemented in different ways depending on which of the following three categories is assigned to the process: internal processes (category INNER), all of whose input and output data belong to one object; intra-level processes (category INTRA), connecting SCM objects that are not subordinate to one another; and inter-level processes (category INTER), describing the transfer of data between an object and its subobjects or between an object and its superobject. The introduced categorization somewhat complicates the creation of an SCM (in some cases fictitious processes may have to be created to provide such typification), but it makes the procedures for formal control of the SCM much more complete and detailed.
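    The INNER / INTRA / INTER categorization can be checked mechanically from the ownership of a process's input and output data; the sketch below assumes that each datum records its owning object and that the object tree provides parent links (all names are hypothetical):

```python
from typing import Dict, Optional, Set

def process_category(input_owners: Set[str], output_owners: Set[str],
                     parent: Dict[str, Optional[str]]) -> str:
    """Infer the category of a process from the objects owning its data:
    INNER - all input and output data belong to one object;
    INTER - data are exchanged between an object and its sub- or superobject;
    INTRA - data connect objects that are not subordinate to each other."""
    owners = input_owners | output_owners
    if len(owners) == 1:
        return "INNER"
    directly_related = lambda a, b: parent.get(a) == b or parent.get(b) == a
    pairs = [(a, b) for a in owners for b in owners if a < b]
    if all(directly_related(a, b) for a, b in pairs):
        return "INTER"
    return "INTRA"
```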

    The main characteristics of processes are a unique name, the characteristics of the process executor, and the functional type of the process, which determines the kind of transformations it carries out and is used when analyzing the correctness of the SCM; in addition, a list of input and output data and their permissible boundary values is used. The process executor specifies its dynamic properties and the method of implementation in the computer. The executor can be specified either directly (in the form of a difference equation) or indirectly, by reference to the name of the software module that implements this process.

    The schema of the conceptual model is formed by the tuple:

    S_SCM ::= < O, P, D_CM, H, OP, PO, U >,   (6)

    where O is the set of KMPO objects (see (4), (5));

    P ::= {p_n}, n = 1, ..., N_P, is the set of KMPO processes;

    D_CM ⊆ D is the data set of the conceptual model, where D is defined in (2), (3);

    H is the hierarchy relation on objects, which, taking into account (4) and (5), takes the form H ::= ∪_a H_a, where H_a ⊂ O_(a-1) × B'(O_a) are the hierarchy relations for each level of the object tree and B'(O_a) is a partition of the set O_a;

    OP ⊂ O × B(P) is the relation “object - processes generating its output data”, where B(P) is a partition of the set P;

    PO ⊂ P × B(O) is the relation “process - objects creating its input data”;

    U ::= U_P ∪ U_O is the relation that formalizes control of the calculation process based on the SCM; it has components of the following form:

    U_P ⊂ P × B(Res) is the relation “process - control data”;

    U_O ⊂ O × B(Res) is the relation “object - control data”.

    The relation “object (process) - control data” associates a datum with a certain object (process) of the model, further defining this object when moving to an algorithmic interpretation. Data transfer between objects is carried out only through the lists of input and output data of these objects, which is consistent with the principles of data encapsulation adopted in modern object-oriented programming. All processes assigned to one object are described by the relation OA ⊂ O × B(P), “object - processes assigned to it”. This relation is not included in the schema of the SCM since, unlike the relations H, OP and PO, it is not specified by the user when constructing the model but is generated automatically.

    The relations defined in the model are conveniently represented in the form of functions (7), partially defined on the sets O and P, with ranges of values B(P), B(O) or B'(O_a). The names of the functions are indicated by lowercase characters corresponding to the uppercase characters in the names of the relations:

    h_a: O_(a-1) → B'(O_a), (∀o_j ∈ O_a, ∀o_i ∈ O_(a-1)) ((o_j = h_a(o_i)) ⇔ o_j H_a o_i);
    op: O → B(P), (∀o_i ∈ O, ∀p_j ∈ P) ((p_j = op(o_i)) ⇔ o_i OP p_j);
    po: P → B(O), (∀o_i ∈ O, ∀p_j ∈ P) ((o_i = po(p_j)) ⇔ p_j PO o_i);                    (7)
    oa: O → B(P), (∀o_i ∈ O, ∀p_j ∈ P) ((p_j = oa(o_i)) ⇔ o_i OA p_j);
    up: P → B(Res), (∀p_i ∈ P, ∀res_j ∈ Res) ((res_j = up(p_i)) ⇔ p_i U_P res_j);
    uo: O → B(Res), (∀o_i ∈ O, ∀res_j ∈ Res) ((res_j = uo(o_i)) ⇔ o_i U_O res_j).

    The sets of values of the functions (7), which form sections of the ranges of the introduced relations along some element of their domains of definition, are indicated in bold:

    h_a(o_i) ::= {o_j : o_j = h_a(o_i)};    op(o_i) ::= {p_j : p_j = op(o_i)};
    po(p_j) ::= {o_i : o_i = po(p_j)};    oa(o_i) ::= {p_j : p_j = oa(o_i)};          (8)
    up(p_i) ::= {res_j : res_j = up(p_i)};    uo(o_i) ::= {res_j : res_j = uo(o_i)}.

    Similarly to (8), sections of the introduced relations over subsets of their domains of definition are written as the unions of all sections over the elements of these subsets. For example, h_a(O_i), where O_i ⊂ O_(a-1), is the set of objects at level a dominated by the given subset of objects O_i located at level a - 1.

    Below we also use the subordination set of an object o_i, h'(o_i) ::= ∪ h(o_i), understood as the union of h over o_i and, recursively, over all of its subobjects (that is, the set of all objects directly or transitively dominated by o_i).
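    The subordination set h'(o_i), that is, all objects directly or transitively dominated by o_i, is simply the transitive closure of the hierarchy function; a possible sketch (hypothetical names):

```python
from typing import Dict, List, Set

def subordination_set(obj: str, h: Dict[str, List[str]]) -> Set[str]:
    """Return h'(obj): every object reachable from obj by repeatedly
    applying the hierarchy function h (direct and indirect subobjects)."""
    result: Set[str] = set()
    stack = list(h.get(obj, []))
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(h.get(current, []))
    return result

# example tree: o0 -> {o1, o2}, o1 -> {o3}
h = {"o0": ["o1", "o2"], "o1": ["o3"]}
print(subordination_set("o0", h))  # {'o1', 'o2', 'o3'}
```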

    The developed algorithms for assigning categories to SCM elements use the relations described above and identify all possible errors in the categorization of model elements. The procedures for checking the correctness of executor assignments for SCM elements rely on the following restrictions (the proofs are given in ).

    Theorem 1. In a finite SCM, recursive decomposition of object executor types cannot occur; that is, no object included in the subordination set of a certain object can have an executor of the same type as the original object.

    Theorem 2. In a finite SCM, inversion of the subordination of object executors cannot occur; that is, no object included in the subordination set of some object with an executor of type e1 can have an executor of the same type as any object whose own subordination set contains an object with an executor of type e1.
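    The restriction of Theorem 1 (no object subordinate to o_i may have an executor of the same type as o_i) admits a direct check; a hedged sketch with hypothetical names:

```python
from typing import Dict, List

def violates_theorem_1(obj: str, executor_type: Dict[str, str],
                       h: Dict[str, List[str]]) -> bool:
    """True if some object transitively subordinate to obj has an executor
    of the same type, i.e. a recursive decomposition of executor types."""
    own_type = executor_type[obj]
    stack, seen = list(h.get(obj, [])), set()
    while stack:
        sub = stack.pop()
        if sub in seen:
            continue
        seen.add(sub)
        if executor_type[sub] == own_type:
            return True
        stack.extend(h.get(sub, []))
    return False
```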

    Principles of SCM solvability control

    The construction of a correct model, carried out in accordance with the rules adopted in the SSM, does not guarantee that this model is solvable, that is, that all the problems declared in it can be solved. Solvability in the general case means the reachability of a certain subset of model objects, defined as target, from another subset of objects, defined as source. Solvability can be considered in two main aspects: when analyzing the entire model as a whole (before the start of calculations), it implies consistency and unambiguity in the description of all acceptable options for achieving the global goal at various levels of the hierarchy, while in the process of modeling itself, solvability consists in ensuring the selection of the correct fragment of the model that describes the situation being studied. The functional difference between these aspects is that when analyzing the entire model, only the potential possibility of modeling all objects described in the model is assessed, whereas when analyzing a specific situation the additional tasks are set of selecting the minimum fragment that describes this situation and of quantitatively comparing the possible alternatives contained in it. The second aspect of solvability is studied in , and here the features of analyzing the solvability of the SCM as a whole are presented; this analysis is carried out automatically after the correctness check is completed and can be performed at any time at the user's request. In general, the problem of solvability analysis can be formulated as follows: two sets of model elements are indicated, the initial and the target, and the model is solvable if there is a sequence of steps that allows the target set to be obtained from the initial one. Simple wave algorithms are suitable for this.

    When analyzing both aspects of decidability, the conceptual model is treated as a formal system. Its alphabet includes:

    symbols denoting model elements (p_i, o_n, res_j, ...);

    functional symbols describing relations and connections between model elements (h_a, op, ...);

    special and syntactic symbols (=, (, ), →, ...).

    The set of formulas of the formal system under consideration is formed by: the symbols denoting the elements of the KMPO:

    (p_i ∈ P) ∪ (o_j ∈ O) ∪ (res_k ∈ D_CM);   (9)

    expressions (7), (8) and other formulas for calculating functions and sets defined using relations introduced over sets (5);

    computability expressions for each process of the conceptual model:

    list_in(p_i) \ list_out(p_i), up(p_i) [, s(p_i)] → p_i, list_out(p_i),   (10)

    where, due to the assumption adopted in the SSM about the autonomy of the structure of each object, the set s(p) of processes preceding pi can only include processes assigned to the same object:

    s(p_i) ⊂ oa(oa^(-1)(p_i));   (11)

    computability expressions for each object of the conceptual model:

    list_in(o_i), uo(o_i), oa(o_i), h(o_i) → o_i, list_out(o_i);   (12)

    expressions for the computability of the input data of each object of the conceptual model that receives material resources from other objects (o_i : oo(o_i) ≠ ∅):

    oo(o_i) → list_in(o_i).   (13)

    Expressions (9)-(13) include only material resources, that is, they do not analyze the output data of the adjustment and feedback processes related to the SCM information resources. In addition, the computability of the sets defined in the premises of these expressions is stated under the condition that all elements of the specified sets are computable.

    The first premise of proposition (10) requires additional justification. As is known, in the presence of cycles on resources in the subject area, data may appear that, when constructing a conceptual model, must be declared as input and output for some CMPO process simultaneously. According to the assumption adopted in the SSM, such cycles are included inside KMPO objects, that is, they must be taken into account when analyzing solvability at the process level.

    If, when analyzing the solvability of the SCM, we use the computability expression proposed in , which for the SCM takes the form:

    list_in(p_i) & up(p_i) [& s(p_i)] → p_i & list_out(p_i),   (14)


    then the model will not be able to include resources that simultaneously serve as input and output data of the same process, that is, to describe recurrent computation processes that are often encountered in practice. A way out of the situation is given by the theorem below, proven in the work.

    Theorem 3. A resource that is simultaneously an input and an output of the same SCM process, and is not an output of any of the processes preceding it that are related to the specified process by the generation relation (13), can be excluded from the left-hand side of the computability expression without violating the correctness of the model solvability analysis.

    The set of axioms of the formal system under consideration includes:

    axioms of computability of all resources related to external data (having executors of type DB, GISE or GEN):

    ⊢ res_j: (ter(res_j) = DB) ∨ (ter(res_j) = GISE) ∨ (ter(res_j) = GEN);   (15)

    axioms of computability of all GIS elements of the SCM (whose types begin with the symbols dot, pol or arc):

    ⊢ o_j: (to(o_j) ⊇ dot) ∨ (to(o_j) ⊇ pol) ∨ (to(o_j) ⊇ arc),   (16)

    where the symbol ⊇ conventionally denotes the inclusion of a standard GIS type in the functional type of an object.

    In the formal system under consideration, two inference rules are specified:

    the rule of direct consequence:

    F_1, F_1 → F_2 ⊢ F_2;   (17)

    the rule of consequence with equality:

    F_1, F_1 = F_2, F_2 → F_3 ⊢ F_3,   (18)

    where F_i are formulas from (9)-(13).

    The structure of the described formal system is similar to the structure of the system proposed in. A significant difference is the type of computability expressions (10), (12), (13) and the composition of the axioms on the basis of which the solvability of the conceptual model is analyzed.

    The totality of knowledge about the subject area presented in the SCM can be considered correct if, at various levels of the hierarchy, the conceptual model actually presents mutually agreed upon specifications of objects and processes that ensure the correct generation of resources for the functioning of objects at higher levels. The correspondence of specifications at all levels leads to the fact that the conceptual model fully characterizes the root object corresponding to the global problem that the system as a whole solves. A conceptual model is decidable if, in its corresponding formal system, there is a derivation of each computability theorem from a set of axioms and other theorems.

    Definition 1. The SCM is solvable if and only if, for each element of the model not included in the set of axioms, the application of computability expressions of the form (10), (12), (13) to the axioms and to already proven formulas (the set of theorems T) allows its derivation from the set of axioms (A) of the formal system (9)-(13) to be constructed using rules (17), (18).

    Solvability analysis, which according to Definition 1 is a kind of automatic theorem proving, uses the concept of an “inference mechanism”, understood here as a method or algorithm for applying the inference rules (17), (18) that provides an effective proof of the entire required set of formulas from the set T of theorems (that is, of syntactically correctly constructed formulas) of the formal system under consideration. The simplest way to organize inference is a “streaming” mechanism, in which the set of formulas considered proven, A', initially equal to the set of axioms (A' = A), is expanded as a result of applying the inference rules. If after some time T ⊆ A', the model is solvable; if this is false and none of the rules can be applied, the SCM is undecidable.
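    The “streaming” mechanism amounts to forward chaining to a fixed point: start from the axioms, repeatedly apply computability expressions whose premises are already proven, and then test whether every required theorem has been obtained. A minimal sketch under these assumptions (the rule representation is hypothetical):

```python
from typing import List, Set, Tuple

Rule = Tuple[Set[str], str]  # (premises, conclusion) ~ expressions (10), (12), (13)

def is_solvable(axioms: Set[str], rules: List[Rule], theorems: Set[str]) -> bool:
    """Forward chaining: expand the proven set A' (initially the axioms)
    until no rule applies, then check whether T is contained in A'."""
    proven: Set[str] = set(axioms)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in proven and premises <= proven:
                proven.add(conclusion)
                changed = True
    return theorems <= proven

# toy example: res1 is an axiom, process p1 computes res2 from res1
rules = [({"res1"}, "p1"), ({"p1"}, "res2")]
print(is_solvable({"res1"}, rules, {"res2"}))  # True
```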


    As a proof strategy used in the analysis of a general conceptual model, a bottom-up strategy is proposed, which consists of cyclically performing the following stages.

    Stage I. Rule (17) is applied to obtain all possible consequences from the formulas and axioms.

    Stage II. Rules (17), (18) are applied to obtain all possible consequences from the axioms and formulas obtained at the previous stage of the proof.

    Stage III. Rule (13) is applied to expand the list of objects considered computable.

    It has been proven that for correct conceptual models constructed according to the rules described above, the analysis of the solvability of the model as a whole comes down to the analysis of the solvability of the individual process templates of the INTRA category and aggregation processes included in it.

    Handling situations

    The theory of situational management notes the fundamental importance of developing procedures for generalizing situation descriptions based on their classification using a set of pragmatically important features, which itself is subject to synthesis. The fundamental features of the formation of concepts and classification in situational management include:

    The presence of generalization procedures based on the structure of relationships between elements of situations;

    Ability to work with names of individual concepts and situations;

    The need to coordinate the classification of situations on some basis with the classification based on a set of influences (controls).

    To implement the listed principles of classification and generalization of situations, the SMS provides a number of software tools:

    An apparatus for the synthesis and analysis of types of situations, in particular optimal sufficient situations, focused on coordinating and harmonizing control actions at the various levels of the SCM;

    Tools for generating and testing hypotheses about the comparative characteristics of sufficient situations within the framework of the probabilistic interpretation of these hypotheses, taking into account the influence of instrumental errors in the source data on the modeling results;

    Procedures for generalizing descriptions of situations taking into account spatio-temporal relationships between the elements of situations, using a library of spatio-temporal functions (PVF).

    Synthesis and analysis of types of situations. As a result of classifying situations using the algorithms developed for the SSM, a large number of classes of situations are generated, obtained for various decision-making objects (DMOs) and various leaf objects of fragments. In order to accumulate knowledge about the classification results, the SMS uses means of generalizing descriptions of situations according to synthesized types of these situations. This method makes concrete the general recommendations for constructing a hierarchical description of situations in situational management systems. Similarly to the description of a complete situation, a generalized description of each sufficient situation is constructed by enumerating the leaf objects included in it together with the DMO, which uniquely defines it owing to the tree-like decomposition of SCM objects. To synthesize a generalized description of a situation at the first level of the hierarchy of descriptions, the same procedure is used that generates the types of object executors from the types of processes assigned to them. Its initial data are the types of the leaf objects and of the DMO of the sufficient situations studied, and the result of its work is a unique type of sufficient situation, supplemented by the serial number of its class and its number within this class. In contrast to the lexicographic order used when generating types of object executors, here the types of objects included in the situation are ordered by their position in the object tree (4). The ordinal number of a class is determined by the number of the resource dominant in this class according to the list of output resources of the DMO, and the ordinal number of a situation within a class is determined by its preference; the optimal sufficient situation of the class receives number 1. It is natural to take as the absolute scale of classification of situations their classification according to the global quality criterion, that is, according to membership in the class of situations that ensure the dominance of one of the output parameters of the global SCM object in terms of generalized costs, which are calculated according to the quality criterion of the DMO in the given sufficient situation. The first key when constructing a situation type is its serial number within the class, then comes the DMO number, then the indices of the types of the list of leaf objects, and at the end the class number. The described indexing procedure is convenient for generating queries such as: “Among the optimal sufficient situations of a certain given level, find a situation that constitutes a subgraph of such-and-such a global optimal situation,” which are typical when solving problems of coordinating controls at various decision-making levels.

    The task of generalizing descriptions of situations in the SMS on the basis of situation types includes two main stages: the search for common features of situations that fall into one class for each studied fragment of the KMPO, and the search for occurrences of situations in situations of higher levels (the height of a level here is set by the location level of the DMO). The general scheme of reasoning during generalization fits well into the ideology of the JSM method. However, a software implementation of the JSM method in the SSM would require a very significant amount of programming, so a probabilistic inference mechanism implemented in the expert system (ES) shell of the SSM was used instead: rather than assessing the validity of hypotheses calculated according to the JSM method, special functions are used to recalculate the conditional probabilities of cause-and-effect relationships between configurations of sufficient situations and the results of their classification.

    As follows from the described method of typing situations in the SMS, descriptions of sufficient situations classified for one fragment of the KMPO differ qualitatively in the lists of their leaf objects, which together form a partition of the set of leaf objects used in constructing the fragment of the complete situation. Therefore, when generalizing their descriptions, mainly the similarity method and the difference method are used, and substrings of the concatenation of leaf object types serve as premises. The results of generalization are formed as two sets of rules: the first includes positive examples, the second negative ones. According to formulas similar to the conversion of a priori probabilities into a posteriori ones, the presence of positive examples increases the conditional probability of the corresponding rule, with the degree of increase proportional to the ordinal numbers of the situations used in the example, while the presence of negative examples reduces the conditional probability of the rule to the same extent. After the end of the first stage of generalization, rules with a probability of less than 0.5 are rejected.
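    The probabilistic substitute for the JSM estimates can be sketched as a simple update of a rule's conditional probability: positive examples raise it (the more strongly, the larger the ordinal number of the situation used), negative examples lower it, and rules falling below 0.5 after the first stage are discarded. The particular update formula below is only an assumption consistent with this description, not the one implemented in the SSM:

```python
def update_rule_probability(p: float, positive: bool,
                            situation_rank: int, max_rank: int,
                            step: float = 0.1) -> float:
    """Move the conditional probability of a generalization rule up
    (positive example) or down (negative example); the correction is
    proportional to the ordinal number of the situation used."""
    delta = step * situation_rank / max_rank
    p = p + delta * (1.0 - p) if positive else p - delta * p
    return min(max(p, 0.0), 1.0)

def keep_after_first_stage(p: float) -> bool:
    """Rules with probability below 0.5 are rejected after stage one."""
    return p >= 0.5
```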

    At the second stage of generalization, similarities are sought between situations at different levels. The same generalization mechanism is used, but the synthesized rules reflect the conditional probabilities of the occurrence of sufficient situations of lower decomposition levels as part of sufficient situations of higher levels and, in particular, of global sufficient situations, by assessing the frequency of occurrence of the types of lower-level situations within the types of higher-level ones. In this way an attempt is made to compare the classes of situations compiled for DMOs of various levels, which, given a sufficient number of training examples, makes it possible to compile a hierarchical classification of sufficient situations, indicating the situations that are optimal for transferring the object to a certain state from a given class.

    Another group of rules is focused on assessing the effectiveness of the alternatives contained in the KMPO. The idea of the search is as follows: the effectiveness of a particular alternative (whether for processes or for objects) is the higher, the wider the set of classes of situations into which sufficient situations with different variants of this alternative fall. And vice versa: if none of the available choice options changes the class of a sufficient situation, this alternative is not offered to the user when expanding the minimum complete situations, at least for the same DMO, which makes it possible to speed up the classification of situations. On the other hand, it is desirable to be able to determine in advance the set of properties possessed by the most “radical” alternatives, or rather several sets, one for each potentially desirable option for changing the areas of dominance.

    All the rules obtained during generalization (in the terminology of situational management they belong to the logical-transformational rules) are stored in the ES of the SSM and are used as control formulas in the process of classifying situations. One more feature of the developed probabilistic inference mechanism should be noted: the ability to reduce the influence of errors in the source data on the results of generalizing situations by taking into account the probability of erroneously assigning a situation to one class or another. Let us consider the main idea of its use for increasing the reliability of the generalization of situations.

    When classifying sufficient situations of a certain fragment of the SCM, errors may occur due to the structural instability of the process of calculating costs when they are transferred between model elements. For example, if cycles on resources are allowed in the KMPO, then when the current value of any resource participating in the cycle changes, the class of the sufficient situation in which the costs of this resource are calculated can change significantly, which, in the author's opinion, violates the stability of the classification and generalization procedures. It is proposed to exclude such situations from the generalization procedures, for which the SMS recommends using procedures that check the dependence of the results on possible modeling errors. If, when analyzing the influence of modeling errors for a certain SCM resource, the share of the change in costs at the DMO output exceeds the share of the test change in the current value of the resource, such a resource is considered unreliable, and the probability of failure when using it for classification is taken to be proportional to the degree of that excess. If the probability of failure exceeds the specified threshold value (the default threshold probability is 0.3), this resource is excluded from the classification procedures. Otherwise, the classification of situations is still carried out, but with the probability of failures taken into account, which in principle reduces the contrast of the classification procedures and, as a consequence, decreases the likelihood that situations involving the unreliable resource will fall into the category of optimal or highly preferable ones.
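    The sensitivity test for an unreliable resource can be read directly from this description: perturb the resource, compare the relative change of costs at the DMO output with the relative change of the input, and convert any excess into a failure probability that is checked against the default threshold of 0.3. A sketch under those assumptions (the proportionality constant is hypothetical):

```python
def failure_probability(input_share: float, output_share: float,
                        scale: float = 1.0) -> float:
    """Failure probability proportional to the excess of the relative change
    of DMO output costs over the relative test change of the resource."""
    if input_share <= 0.0:
        return 1.0
    excess = max(0.0, output_share - input_share)
    return min(1.0, scale * excess / input_share)

def usable_for_classification(p_failure: float, threshold: float = 0.3) -> bool:
    """Resources whose failure probability exceeds the threshold
    (0.3 by default) are excluded from the classification procedures."""
    return p_failure <= threshold
```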

    Analysis of spatio-temporal dependencies. Work with spatio-temporal dependencies is carried out using a library of spatio-temporal functions (PVF): software modules that select the information relevant to the current request from the corresponding source data bases (BID), enter this information into the main database, and process it in order to decide whether the condition forming the request is true or false. Therefore, in the general case, the program of each PVF includes three parts: a BID driver that organizes the interface between the main database and the BID, a program for writing the query results to the main database, and a program for interpreting the query results. In this case, a change in the subject area requires modifying only the BID drivers.

    All PVFs have a logical type output, that is, they return a “yes” or “no” answer as a result of analyzing the logical condition included in them. Two types of time and three types of spatial functions have been developed.


    The INTERVAL time function supports sampling of retrospective data over a certain period of time; its syntax is as follows:

    during (<condition>, <start>, <end>, <share>),   (19)

    where <condition> may look like

    <name> <sign> <value_sublist(n)>,   (20)

    and defines the controlled characteristic of an array element;

    <start> and <end> set, respectively, the initial and final moments of the verification interval (their distance in the past from the current moment in time);

    <share> defines the minimum acceptable percentage (number) of elements, among all analyzed elements, that must satisfy the <condition> for function (19) to give an affirmative answer to the request.

    If a zero value is entered for the <start> parameter, all available information up to the time point <end> is analyzed. Similarly, with a zero value of the <end> parameter, data from the moment <start> up to the current moment in time are analyzed. If the values of <start> and <end> coincide, only one point in time in the past is considered.

    The following function allows the stored data to be bound to the time point specified in the request:

    moment (<condition>, <time>, <share>),   (21)

    where <condition> and <share> are formed as in function (19), and <time> is the fixed point in time for which the operation is performed.

    Spatial functions are written in the form:

    neighboring (<condition>, <share>),   (22)

    similar (<condition>, <share>, <similarity_parameters>).   (23)

    The parameters <condition> and <share> are specified as in functions (19), (21); the difference between the two types of spatial functions lies in the criteria for selecting elements for joint analysis: in function (22) elements that are geometrically adjacent to the current one are analyzed, while in function (23) elements are selected that have the same values of the <similarity_parameters> as the current element, chosen from a list of names of existing parameters and variables. For example, in the application of the SSM to the problem of predicting rock bursts, the <similarity_parameter> was named “fault” and was used for joint analysis of the characteristics of object elements belonging to the same tectonic fault.

    The NEAREST function is intended to find the object whose spatial coordinates are closest to the given ones. The function returns an affirmative answer if the coordinates of the object fall within the specified neighborhood. The function has the form:

    nearest (<condition>, <coordinates>, <tolerance>),   (24)

    where the parameter <condition> has the meaning already described, the parameter <coordinates> describes the spatial characteristics of the anchor point, and the parameter <tolerance> specifies the permissible distance in spatial coordinates from the specified point.

    PVFs can be used only in the IF parts of ES rules and control formulas. Since all PVFs have a logical output, one-time nesting of different PVFs into each other is allowed, that is, queries of the form

    neighboring (similar (<condition>, <share1>, <similarity_parameters>), <share2>).   (25)

    In this case, the BID driver generates a request according to which the elements satisfying the innermost PVF are selected first, then those satisfying the next outer one, and so on. The characteristics of the selected elements are written into the database (this information is used in the explanation mode), and the interpreter calculates the output value of the PVF, which is entered into the rule base. Nested queries are of greatest interest because they allow, by combining PVFs, the spatial and temporal characteristics of the object under study to be evaluated jointly.
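    Since every PVF returns a logical value, they compose naturally; the sketch below imitates the signatures of (19), (22) and the nesting of (25) with hypothetical Python wrappers over an abstract data source (none of these names come from the SSM implementation):

```python
from typing import Callable, Dict, List

Element = Dict  # an element of the analysed array with its stored history

def during(elements: List[Element], condition: Callable[[Element], bool],
           start: float, end: float, share: float) -> bool:
    """Rough analogue of the INTERVAL function (19): true if at least `share`
    of the elements whose retrospective depth lies between <start> and <end>
    satisfy <condition>."""
    lo, hi = sorted((start, end))
    selected = [e for e in elements if lo <= e.get("age", 0.0) <= hi]
    if not selected:
        return False
    return sum(1 for e in selected if condition(e)) / len(selected) >= share

def neighboring(elements: List[Element], current: Element,
                condition: Callable[[Element], bool], share: float) -> bool:
    """Rough analogue of function (22): applies <condition> to the elements
    geometrically adjacent to the current one (adjacency is assumed to be
    precomputed and stored in the element itself)."""
    near = [e for e in elements if e.get("id") in current.get("neighbors", ())]
    if not near:
        return False
    return sum(1 for e in near if condition(e)) / len(near) >= share

# one-time nesting in the spirit of (25): the inner PVF is wrapped into the
# <condition> passed to the outer one, e.g.
# neighboring(elements, cur, lambda e: during([e], some_condition, 0, 10, 1.0), 0.5)
```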

    The PVFs described above provide analysis of a fairly wide class of spatio-temporal relationships between the characteristics of the elements of the object under examination; however, depending on the specifics of the subject area, other PVFs can be developed.

    In contrast to the rules generated when generalizing situations by their types, the generalization rules of the group considered here apply not to the situation as a whole but to individual objects, processes, or even SCM resources. The PVF slots <condition> and <similarity_parameters> may include logical conditions and various characteristics of SCM elements, including the types and categories of these elements. The SMS does not provide automatic procedures for generating such rules; they are constructed by the user, and the probabilities in them are recalculated during classification in the same way as described above.

    Conclusion

    Based on the introduced formal definitions of the various types of situations that arise when modeling an SDO, a hierarchical model of the SDO has been developed. It includes a formal system - the SCM and the expert system integrated with it - with a set of basic elements (7)-(10), a set of syntactic rules for generating some SCM elements from others in the form of relations of type (7), (8), a system of axioms (15), (16) and inference rules (17), (18), as well as rules for changing the components of this formal system depending on the purposes of modeling and on the situation prevailing at the object of study, which is specified by selecting the appropriate fragments of the SCM and by controlling the inference in the ES of the SSM. The SCM is thus a semiotic (sign) model, since three groups of logical-transformational rules are developed within it - replenishment, classification and generalization of situations.

    What distinguishes the proposed model is the integration of tools focused on the study of SDOs, which provides joint logical and analytical data processing and situational analysis of the state of the object under study using expert knowledge, taking into account spatio-temporal dependencies in the characteristics of the SDO with the help of cartographic information.

    LITERATURE

    1. Kuzmin I.A., Putilov V.A., Filchakov V.V. Distributed information processing in scientific research. L.: Nauka, 1991. 304 p.
    2. Tsikritzis D., Lokhovsky F. Data models. M.: Finance and Statistics, 1985. 420 p.
    3. Samarsky A.A. Introduction to numerical methods. M.: Nauka, 1987. 288 p.
    4. Brzhezovsky A.V., Filchakov V.V. Conceptual analysis of computing systems. St. Petersburg: LIAP, 1991. 78 p.
    5. Fridman A.Ya. Situational management of the structure of industrial-natural systems. Methods and models. Saarbrucken, Germany: LAP LAMBERT Academic Publishing, 2015. 530 p.
    6. Pospelov D.A. Situational management: theory and practice. M.: Nauka, 1986. 288 p.
    7. Mitchell E. ESRI Guide to GIS Analysis. 1999. Vol. 1. 190 p.
    8. Conceptual modeling of information systems / ed. V.V. Filchakov. St. Petersburg: SPVURE PVO, 1998. 356 p.
    9. Automatic generation of hypotheses in intelligent systems / comp. E.S. Pankratova, V.K. Finn. M.: LIBROKOM, 2009. 528 p.
    10. Darwiche A. Modeling and Reasoning with Bayesian Networks. Cambridge University Press, 2009. 526 p.

    Fridman Alexander Yakovlevich - Doctor of Technical Sciences, Professor, Leading Researcher at the Institute of Informatics and Mathematical Modeling of the KSC RAS; e-mail: fridman@iimm.kolasc.net.ru


    Spatial integration of individual elements of a technical object is a widespread design task in any branch of technology: radio electronics, mechanical engineering, energy, etc. A significant part of spatial modeling is the visualization of individual elements and the technical object as a whole. Of great interest are the issues of constructing a database of graphical three-dimensional models of elements, algorithms and software implementation of graphical applications to solve this problem.

    The construction of models of elements is universal in nature and can be considered as an invariant part of many systems of spatial modeling and computer-aided design of technical objects.

    Regardless of the capabilities of the graphical environment used, according to the nature of the formation of graphical models, three groups of elements can be distinguished:

    1. Unique elements, the configuration and dimensions of which are not repeated in other similar parts.

    2. Unified elements, including a certain set of configuration fragments characteristic of parts of a given class. As a rule, there is a limited number of standard sizes of a unified element.

    3. Composite elements, including both unique and unified elements in an arbitrary set. The graphical tools used may allow some nesting of constituent elements.

    Spatial modeling of unique elements is not very difficult. Direct generation of the model configuration is performed interactively, after which the software implementation is designed based on the model generation protocol or a text description of the resulting element.

    The construction of models of unified elements is the most general in character. At the first stage, as a result of systematizing the nomenclature of elements that are of the same type in purpose and in composition of graphic fragments, a hypothetical sample is formed, or an existing sample of the modeled element is selected, that has a full set of the modeled parts of the object. The subsequent stages are:

    2. Alternately selecting fragments of the spatial configuration and determining their sizes;
    3. Linking the graphical model of the element to other elements, technical objects or systems;
    4. Entering additional information about the modeled element.

    This approach to creating models of unified elements ensures reliable software implementation.

    The composite element model consists of a set of models of both unique and unified elements. Procedurally, a model of a composite element is built similarly to a model of a unified element, in which ready-made models of elements act as graphic fragments. The main features are the method of mutual binding of included models and the mechanics of combining individual fragments into a composite element. The latter is determined mainly by the capabilities of graphical tools.

    Integration of a graphical environment and a database management system (DBMS) for technical information ensures the openness of the modeling system for solving other design problems: preliminary design calculations, selection of the element base, preparation of design documentation (text and graphic), etc. The structure of the database (DB) is determined by the requirements of the graphic models and by the information needs of the related tasks. Any DBMS that can be interfaced with the graphical environment can be used as the tool.

      Methods of interpolation from discretely located points.

    The general problem of interpolation from points is formulated as follows: given a set of points (interpolation nodes) whose positions and characteristic values are known, it is necessary to determine the values of the characteristics at other points for which only the position is known. Interpolation methods are divided into global and local, and within each group into exact and approximating.

    Global interpolation uses a single calculation function z = F(x, y) for the entire area at once. In this case, changing one input value affects the entire resulting DEM. In local interpolation, the calculation algorithm is applied repeatedly to samples from the common set of points, usually points located close together, so changing the choice of points affects only the results for a small part of the territory. Global interpolation algorithms produce smooth surfaces with few sharp features; they are used when the shape of the surface, such as a trend, is presumably known. When a large portion of the total data set is included in the local interpolation process, it essentially becomes global.

      Exact interpolation methods.

    Exact interpolation methods reproduce the data at the points (nodes) on which the interpolation is based, so the surface passes through all points with known values. The simplest is nearest-neighbor analysis, in which all values of the modeled characteristic are taken equal to the value at the nearest known point. As a result, Thiessen polygons are formed, with an abrupt change of values at their boundaries. The method is used in environmental studies and in assessing impact zones, and is best suited to nominal data.
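    A minimal sketch of this approach, assuming SciPy is available; the sample points, values and grid are illustrative, not data from the text:

        # Nearest-neighbour (Thiessen-polygon) interpolation: an exact method in which
        # every grid cell takes the value of the closest known node.
        import numpy as np
        from scipy.interpolate import NearestNDInterpolator

        pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0], [5.0, 6.0]])
        vals = np.array([12.0, 15.0, 9.0, 11.0, 20.0])   # values at the interpolation nodes

        interp = NearestNDInterpolator(pts, vals)

        # regular grid over the study area; the result is piecewise constant,
        # with jumps along the boundaries of the Thiessen polygons
        gx, gy = np.meshgrid(np.linspace(0, 10, 101), np.linspace(0, 10, 101))
        dem = interp(gx, gy)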

    In the B-spline method, a piecewise polynomial (typically cubic) is constructed, producing a series of patches that together form a surface with continuous first and second derivatives. The method ensures continuity of heights, slopes and curvature. The resulting DEM is in raster form. This local interpolation method is used mainly for smooth surfaces and is not suitable for surfaces with sharp changes, which cause strong oscillations of the spline. It is widely used in general-purpose surface interpolation programs and for smoothing contours when drawing them.
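    A minimal sketch of spline surface fitting, assuming SciPy; the scattered sample points and the smoothing factor are arbitrary assumptions:

        # Bicubic smoothing spline over scattered points: kx = ky = 3 gives continuous
        # first and second derivatives; s trades fidelity to the nodes for smoothness.
        import numpy as np
        from scipy.interpolate import SmoothBivariateSpline

        rng = np.random.default_rng(0)
        x = rng.uniform(0, 10, 60)
        y = rng.uniform(0, 10, 60)
        z = np.sin(x / 3.0) + 0.1 * y                  # smooth synthetic "terrain"

        spl = SmoothBivariateSpline(x, y, z, kx=3, ky=3, s=len(x))

        xi = np.linspace(0, 10, 101)
        yi = np.linspace(0, 10, 101)
        dem = spl(xi, yi)                              # raster DEM on a regular grid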

    In TIN models, the surface within each triangle is usually represented as a plane. Since the plane in each triangle is defined by the heights of its three vertices, the triangles of adjacent areas in the overall mosaic surface adjoin exactly along their sides, and the resulting surface is continuous. However, if contour lines are drawn on such a surface, they will be straight and parallel within the triangles and will change direction abruptly at triangle boundaries. For some TIN applications, therefore, a mathematical surface is constructed within each triangle so that the slope angles change smoothly across the triangle boundaries.

    Trend analysis. The surface is approximated by a polynomial, and the output data structure is an algebraic function that can be used to calculate values at raster points or at any point on the surface. A linear equation, for example z = a + bx + cy, describes an inclined plane, while a quadratic one, z = a + bx + cy + dx² + exy + fy², describes a simple hill or valley. In general, any cross-section of a surface of order t has no more than (t − 1) alternating maxima and minima; a cubic surface, for example, can have one maximum and one minimum in any cross-section. Significant edge effects are possible because the polynomial model produces a convex surface.
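    A minimal sketch of trend analysis: the quadratic surface above is fitted by least squares to assumed scattered points (NumPy only):

        # Trend-surface analysis: fit z = a + b*x + c*y + d*x^2 + e*x*y + f*y^2 by least squares.
        import numpy as np

        rng = np.random.default_rng(1)
        x = rng.uniform(0, 10, 80)
        y = rng.uniform(0, 10, 80)
        z = 5 + 0.4 * x - 0.2 * y + 0.05 * x * y + rng.normal(0, 0.1, 80)

        A = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])   # design matrix
        coef, *_ = np.linalg.lstsq(A, z, rcond=None)

        def trend(px, py):
            """Evaluate the fitted trend surface at arbitrary points."""
            return (coef[0] + coef[1] * px + coef[2] * py
                    + coef[3] * px**2 + coef[4] * px * py + coef[5] * py**2)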

    Moving average and distance-weighted average methods are the most widely used, especially for modeling smoothly changing surfaces. The interpolated value is the average of the values at the n known points (or the average obtained from interpolated points) and in the general case is represented by a formula of the form z(x0) = Σ wi·zi / Σ wi, where the weights wi usually decrease with the distance from x0 to the i-th known point (for example, wi = 1/di^p).
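    A minimal sketch of the distance-weighted average with weights wi = 1/di^p; the power p and the sample data are assumptions:

        # Inverse-distance weighting: estimates are weighted averages of the known values,
        # with weights decreasing with distance to the target point.
        import numpy as np

        def idw(nodes, values, targets, p=2.0, eps=1e-12):
            """nodes: (n,2) known points; values: (n,) heights; targets: (m,2) points to estimate."""
            d = np.linalg.norm(targets[:, None, :] - nodes[None, :, :], axis=2)   # (m, n)
            w = 1.0 / (d + eps) ** p
            return (w * values).sum(axis=1) / w.sum(axis=1)

        nodes = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
        values = np.array([12.0, 15.0, 9.0, 11.0])
        print(idw(nodes, values, np.array([[5.0, 5.0], [1.0, 1.0]])))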

      Approximation interpolation methods.

    Approximating interpolation methods are used when there is some uncertainty about the available surface data; they are based on the observation that many data sets exhibit a slowly changing surface trend overlaid with local, rapidly changing deviations that lead to inaccuracies or errors in the data. In such cases, smoothing by approximating the surface makes it possible to reduce the influence of erroneous data on the character of the resulting surface.

      Methods of interpolation by area.

    Interpolation by area involves transferring data from one source set of areas (key) to another set (target) and is often used when zoning a territory. If the target habitats are a grouping of key habitats, this is easy to do. Difficulties arise if the boundaries of the target areas are not related to the original key areas.

    Let's consider two options for interpolation by area: in the first of them, as a result of interpolation, the total value of the interpolated indicator (for example, population size) of the target areas is not fully preserved, in the second, it is preserved.

    Let's imagine that we have population data for some regions with given boundaries, and they need to be extended to a smaller zoning grid, the boundaries of which generally do not coincide with the first.

    The technique is as follows. For each source (key) area, the population density is calculated by dividing the total number of residents by the area of the zone, and the resulting value is assigned to its central point (centroid). From this set of points a regular grid is interpolated using one of the methods described above, and the population of each grid cell is obtained by multiplying the calculated density by the cell area. The interpolated grid is then superimposed on the target-area map, the cells falling within the boundaries of each target area are identified, and the total population of each resulting area is calculated.
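    A minimal sketch of this first variant; the zone geometry, populations, toy zoning and the use of nearest-neighbour gridding via scipy.interpolate.griddata are assumptions made only to keep the example short:

        # Centroid-based areal interpolation: density at key-zone centroids is gridded,
        # converted to counts per cell and summed over the cells of each target zone.
        import numpy as np
        from scipy.interpolate import griddata

        cell_area = 1.0                                    # km^2 per grid cell
        gx, gy = np.meshgrid(np.arange(0.5, 10.5), np.arange(0.5, 10.5))

        centroids = np.array([[2.5, 2.5], [7.5, 2.5], [5.0, 7.5]])   # key-zone centroids
        areas = np.array([25.0, 25.0, 50.0])                         # key-zone areas, km^2
        population = np.array([10_000.0, 4_000.0, 6_000.0])
        density = population / areas                                 # residents per km^2

        # any point-interpolation method described above could be used; nearest-neighbour
        # gridding is taken here for brevity
        dens_grid = griddata(centroids, density, (gx, gy), method="nearest")
        count_grid = dens_grid * cell_area                           # residents per cell

        labels = (gx > 5.0).astype(int)                # toy target zoning: west = 0, east = 1
        pop_east = count_grid[labels == 1].sum()       # population of the eastern target zone

    In general the totals over the target zones need not add up to population.sum(), which is exactly the drawback discussed below.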

    The disadvantages of the method include the not entirely clear choice of the central point, the inadequacy of point-by-point interpolation methods for this task and, most importantly, the fact that the total value of the interpolated indicator over the key areas (in this case, the total population of the census zones) is not preserved. For example, if a source zone is divided into two target zones, the total population in them after interpolation will not necessarily equal the population of the source zone.

    In the second version of interpolation, methods of GIS overlay technology or construction of a smooth surface based on the so-called adaptive interpolation are used.

    In the first method, the key and target areas are overlaid, the share of each source area falling into each target area is determined, and the indicator value of each source area is divided in proportion to the areas of its parts lying in the different target areas. The density of the indicator within each source area is assumed to be uniform; for example, if the indicator is the total population of the area, its population density is taken to be constant.
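    A minimal sketch of this proportional split, assuming the overlap areas between key and target zones are already known (in a GIS they come from polygon overlay):

        # Areal weighting: each key zone's indicator is divided among target zones
        # in proportion to the overlap areas, assuming uniform density within each key zone.
        import numpy as np

        source_pop = np.array([10_000.0, 4_000.0])        # populations of two key zones

        # overlap[i, j] = area of key zone i falling into target zone j (km^2)
        overlap = np.array([[20.0,  5.0],
                            [ 3.0, 22.0]])

        share = overlap / overlap.sum(axis=1, keepdims=True)   # fractions of each key zone
        target_pop = source_pop @ share                        # populations of target zones
        # target_pop.sum() == source_pop.sum(): the total value of the indicator is preserved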

    The purpose of the second method is to create a smooth surface without ledges (attribute values should not change sharply at the boundaries of areas) while preserving the total value of the indicator within each area. Its technique is as follows. A dense raster is superimposed on the cartogram of key areas; the total value of the indicator for each area is divided equally among the raster cells overlapping it; the values are smoothed by replacing the value of each cell with the average over its neighborhood (a 2×2, 3×3 or 5×5 window); and the values are summed over all cells of each area. The values of all cells are then adjusted proportionally so that the total for the area coincides with the original one (for example, if the sum is 10% less than the original value, the value of each cell is increased by 10%). The process is repeated until the changes become negligible.
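    A minimal sketch of such adaptive, total-preserving smoothing on a raster; the two-area layout, the 3×3 averaging window and the stopping tolerance are assumptions:

        # Adaptive (volume-preserving) areal interpolation: smooth the raster, then rescale
        # each area so that its total again equals the original value; repeat to convergence.
        import numpy as np

        labels = np.zeros((20, 20), dtype=int)         # raster of key-area labels
        labels[:, 10:] = 1                             # toy layout: west = 0, east = 1
        totals = np.array([10_000.0, 4_000.0])         # indicator totals of the two areas

        value = np.zeros_like(labels, dtype=float)
        for k, t in enumerate(totals):                 # start from a uniform distribution
            value[labels == k] = t / (labels == k).sum()

        for _ in range(100):                           # iterate until the changes die out
            p = np.pad(value, 1, mode="edge")          # 3x3 moving average, edges padded
            smooth = sum(p[i:i + 20, j:j + 20] for i in range(3) for j in range(3)) / 9.0
            for k, t in enumerate(totals):             # restore each area's original total
                m = labels == k
                smooth[m] *= t / smooth[m].sum()
            if np.abs(smooth - value).max() < 1e-6:
                break
            value = smooth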

    For the described method, homogeneity within areas is not necessary, but too strong variations in the indicator within their limits can affect the quality of interpolation.

    The results can be represented on the map by contours or continuous halftones.

    Application of the method requires setting some boundary conditions, since along the periphery of the original areas, raster elements may extend beyond the study area or be adjacent to areas that do not have the value of the interpolated indicator. You can, for example, set the population density to 0 (lake, etc.) or set it equal to the values ​​of the outermost cells in the study area.

    Very complex cases can arise in interpolation by area, for example, when a map of "settlement areas" must be created from population data for individual cities, especially if at the map scale these cities are shown as points. The same problem occurs for small source areas when there are no boundary files and the data indicate only the location of the center point. Different approaches are possible here: replacing the points to which the data are assigned with circles whose radius is estimated from the distances to neighboring centroids; defining a threshold population density for classifying an area as urban; distributing the population of each city over its territory so that the density is highest in the center and decreases toward the outskirts; or drawing the boundaries of populated areas through points with the threshold value of the indicator.

    Often attempting to create a continuous surface using area interpolation from point-only data will produce incorrect results.

    The user usually evaluates the success of the method subjectively and mainly visually. Until now, many researchers use manual interpolation or interpolation “by eye” (this method is usually not highly regarded by geographers and cartographers, but is widely used by geologists). Currently, attempts are being made to “extract” the knowledge of experts using methods for creating knowledge bases and introducing them into an expert system that performs interpolation.

    Time series models characterizing the dependence of the resulting variable on time include:

    a) a model of the dependence of the resulting variable on the trend component or a trend model;

    b) a model of the dependence of the resulting variable on the seasonal component, or a seasonality model;

    c) a model of the dependence of the resulting variable on the trend and seasonal components or a model of trend and seasonality.

    If economic statements reflect a dynamic (time-dependent) relationship of the variables included in the model, then the values of such variables are dated and are called dynamic or time series. If economic statements reflect a static relationship (relating to one period of time) of all variables included in the model, then the values of such variables are usually called spatial data, and there is no need to date them. Lagged variables are exogenous or endogenous variables of an economic model dated to previous points in time and appearing in equations together with current variables. Models that include lagged variables belong to the class of dynamic models. Predetermined variables are the lagged and current exogenous variables, as well as the lagged endogenous variables.
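    As an illustration, a minimal sketch of a dynamic model with one lagged endogenous variable, y_t = a + b*x_t + c*y_(t-1) + e_t; the series and coefficient values are synthetic assumptions, and the parameters are estimated by ordinary least squares:

        # Dynamic model with a lagged endogenous variable, estimated by OLS on simulated data.
        import numpy as np

        rng = np.random.default_rng(2)
        n = 120
        x = rng.normal(size=n)                     # current exogenous variable
        y = np.zeros(n)
        for t in range(1, n):                      # y depends on x_t and its own lag y_(t-1)
            y[t] = 1.0 + 0.5 * x[t] + 0.7 * y[t - 1] + rng.normal(scale=0.1)

        Y = y[1:]
        X = np.column_stack([np.ones(n - 1), x[1:], y[:-1]])   # constant, x_t, y_(t-1)
        a_hat, b_hat, c_hat = np.linalg.lstsq(X, Y, rcond=None)[0]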


    23. Trend and spatiotemporal EM in economic planning

    Statistical observations in socio-economic studies are usually carried out regularly, at equal intervals of time, and are presented in the form of time series xt, where t = 1, 2, ..., p. Trend regression models, whose parameters are estimated from the available statistical base, are used as a tool for statistical forecasting of time series; the main tendencies (trends) are then extrapolated over a given time interval.

    Statistical forecasting methodology involves building and testing many models for each time series, comparing them based on statistical criteria, and selecting the best ones for forecasting.



    When modeling seasonal phenomena in statistical studies, two types of fluctuations are distinguished: multiplicative and additive. In the multiplicative case, the range of seasonal fluctuations changes over time in proportion to the trend level and is reflected in the statistical model by a multiplier. With additive seasonality, it is assumed that the amplitude of seasonal deviations is constant and does not depend on the trend level, and the fluctuations themselves are represented in the model by a term.
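    A short sketch contrasting the two types of fluctuations on a synthetic monthly series; the trend, amplitude and period are assumed values:

        # Additive vs multiplicative seasonality on the same linear trend.
        import numpy as np

        t = np.arange(48)                               # four years of monthly data
        trend = 100 + 2.0 * t
        season = np.sin(2 * np.pi * t / 12)

        additive = trend + 10 * season                  # constant seasonal amplitude (a term)
        multiplicative = trend * (1 + 0.1 * season)     # amplitude proportional to the trend (a multiplier)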

    The basis of most forecasting methods is extrapolation, associated with the dissemination of patterns, connections and relationships operating in the period under study beyond its borders, or - in a broader sense of the word - it is obtaining ideas about the future based on information related to the past and present.

    The most famous and widely used are trend and adaptive forecasting methods. Among the latter, one can highlight methods such as autoregression, moving average (Box-Jenkins and adaptive filtering), exponential smoothing methods (Holt, Brown and exponential average), etc.
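    A minimal sketch of simple exponential smoothing of the Brown type; the smoothing constant alpha and the sample series are assumptions:

        # Simple exponential smoothing: each smoothed value is a weighted combination of the
        # current observation and the previous smoothed value; the last value is the forecast.
        import numpy as np

        def exp_smooth(series, alpha=0.3):
            s = np.empty(len(series), dtype=float)
            s[0] = series[0]
            for t in range(1, len(series)):
                s[t] = alpha * series[t] + (1 - alpha) * s[t - 1]
            return s

        data = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119], dtype=float)
        forecast_next = exp_smooth(data)[-1]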

    To assess the quality of the forecast model under study, several statistical criteria are used.

    When presenting a set of observational results in the form of time series, the assumption is actually used that the observed values ​​belong to a certain distribution, the parameters of which and their changes can be estimated. Using these parameters (usually the mean value and variance, although sometimes a more complete description is used) one can build one of the models for the probabilistic representation of the process. Another probabilistic representation is a model in the form of a frequency distribution with parameters pj for the relative frequency of observations falling into the jth interval. Moreover, if no change in the distribution is expected during the accepted lead time, then the decision is made on the basis of the existing empirical frequency distribution.

    When making forecasts, it is necessary to keep in mind that all factors influencing the behavior of the system in the base (studied) and forecast periods must be constant or change according to a known law. The first case is implemented in single-factor forecasting, the second - in multi-factor forecasting.

    Multifactor dynamic models must take into account spatial and temporal changes in factors (arguments), as well as (if necessary) the lag of the influence of these factors on the dependent variable (function). Multifactor forecasting makes it possible to take into account the development of interrelated processes and phenomena. Its basis is a systematic approach to the study of the phenomenon under study, as well as the process of understanding the phenomenon, both in the past and in the future.

    In multifactor forecasting, one of the main problems is the choice of the factors that determine the behavior of the system; it cannot be solved purely statistically, but only through an in-depth study of the essence of the phenomenon. Here it is necessary to emphasize the primacy of analysis (comprehension) over purely statistical (mathematical) methods of study. In traditional methods (for example, least squares), observations are assumed to be independent of each other with respect to the same argument. In reality autocorrelation is present, and failure to take it into account leads to suboptimal statistical estimates and makes it difficult to construct confidence intervals for the regression coefficients and to test their significance. Autocorrelation is detected from deviations from the trend. It can arise if the influence of a significant factor, or of several less significant factors acting "in one direction", is not taken into account, or if the model relating the factors to the function is chosen incorrectly. The Durbin-Watson test is used to detect autocorrelation; to eliminate or reduce it, one passes to the random component (detrending) or introduces time into the multiple regression equation as an argument.
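    A minimal sketch of the Durbin-Watson statistic computed from regression residuals (the residuals shown are assumed example values; values near 2 indicate no first-order autocorrelation, values near 0 or 4 indicate strong positive or negative autocorrelation):

        # Durbin-Watson statistic: sum of squared successive differences of the residuals
        # divided by the sum of squared residuals.
        import numpy as np

        def durbin_watson(residuals):
            e = np.asarray(residuals, dtype=float)
            return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

        e = np.array([0.5, 0.6, 0.4, -0.2, -0.5, -0.4, 0.1, 0.3, 0.2, -0.1])
        print(durbin_watson(e))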

    In multifactor models, the problem of multicollinearity also arises - the presence of a strong correlation between factors, which can exist regardless of any dependence between the function and the factors. By identifying which factors are multicollinear, it is possible to determine the nature of the interdependence between the multicollinear elements of a set of independent variables.
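    One common diagnostic is the variance inflation factor, VIF_j = 1/(1 - R_j^2), where R_j^2 is obtained by regressing factor j on the remaining factors. The sketch below computes it directly; the data are synthetic, with x2 deliberately made almost collinear with x1:

        # Variance inflation factors as a multicollinearity diagnostic.
        import numpy as np

        def vif(X):
            X = np.asarray(X, dtype=float)
            out = []
            for j in range(X.shape[1]):
                y = X[:, j]
                Z = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
                beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
                resid = y - Z @ beta
                r2 = 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
                out.append(1.0 / (1.0 - r2))
            return np.array(out)

        rng = np.random.default_rng(3)
        x1 = rng.normal(size=100)
        x2 = x1 + rng.normal(scale=0.1, size=100)      # nearly collinear with x1
        x3 = rng.normal(size=100)
        print(vif(np.column_stack([x1, x2, x3])))      # large values flag x1 and x2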

    In multivariate analysis, it is necessary, along with estimating the parameters of the smoothing (studied) function, to construct a forecast for each factor (based on some other functions or models). Naturally, the values ​​of factors obtained in the experiment in the base period do not coincide with similar values ​​found using predictive models for factors. This difference must be explained either by random deviations, the magnitude of which is revealed by the indicated differences and should be taken into account immediately when estimating the parameters of the smoothing function, or this difference is not random and no prediction can be made. That is, in a multifactor forecasting problem, the initial values ​​of the factors, as well as the values ​​of the smoothing function, must be taken with the corresponding errors, the distribution law of which must be determined by appropriate analysis preceding the forecasting procedure.


    24. Essence and content of EM: structural and expanded

    Econometric models are systems of interconnected equations, many of whose parameters are determined by methods of statistical data processing. To date, many hundreds of econometric systems have been developed and used abroad for analytical and forecasting purposes. Macroeconometric models, as a rule, are first presented in a natural, meaningful form, and then in a reduced, structural form. The natural form of econometric equations allows us to qualify their content and assess their economic meaning.

    To build forecasts of endogenous variables, the current endogenous variables of the model must be expressed as explicit functions of the predetermined variables. The specification obtained by mathematically formalizing economic laws and adding random disturbances is called structural. In general, in a structural specification the endogenous variables are not expressed explicitly through the predetermined ones.

    In the equilibrium market model, only the supply variable is expressed explicitly through a predefined variable, so to represent endogenous variables through predefined ones, it is necessary to perform some transformations of the structural form. Let us solve the system of equations for the latter specification with respect to endogenous variables.

    Thus, the endogenous variables of the model are expressed explicitly through the predetermined variables. This form of specification is called reduced. In a particular case, the structural and reduced forms of the model may coincide. With a correct specification of the model, the transition from the structural form to the reduced form is always possible, while the reverse transition is not always possible.

    A system of joint, simultaneous equations (or structural form of a model) typically contains endogenous and exogenous variables. Endogenous variables are denoted in the system of simultaneous equations presented earlier as y. These are dependent variables, the number of which is equal to the number of equations in the system. Exogenous variables are usually denoted as x. These are predetermined variables that influence but are not dependent on endogenous variables.

    The simplest structural form of the model (for two endogenous and two exogenous variables) is:

    y1 = b12·y2 + a11·x1 + ε1,
    y2 = b21·y1 + a22·x2 + ε2,

    where y are endogenous variables; x – exogenous variables.

    The classification of variables into endogenous and exogenous depends on the theoretical concept of the adopted model. Economic variables can act as endogenous variables in some models and as exogenous variables in others. Non-economic variables (for example, climatic conditions) enter the system as exogenous variables. The values ​​of endogenous variables for the previous period of time (lagged variables) can be considered as exogenous variables.

    Thus, the current year's consumption (yt) may depend not only on a number of economic factors, but also on the level of consumption in the previous year (yt-1).

    The structural form of the model allows you to see the impact of changes in any exogenous variable on the values ​​of the endogenous variable. It is advisable to select as exogenous variables those variables that can be the object of regulation. By changing and managing them, it is possible to have target values ​​of endogenous variables in advance.

    On its right-hand side the structural form of the model contains the coefficients bi and aj for the endogenous and exogenous variables (bi is the coefficient at an endogenous variable, aj at an exogenous variable); these are called the structural coefficients of the model. All variables in the model are expressed as deviations from their mean levels, i.e. x stands for x − x̄ and y for y − ȳ; therefore there is no free term in any equation of the system.

    Applying OLS to estimate the structural coefficients of the model gives, as is commonly accepted in theory, biased and inconsistent estimates of the structural coefficients. Therefore, to estimate the structural coefficients, the structural form of the model is transformed into the reduced form of the model.

    The reduced form of the model is a system of linear functions of the endogenous variables in terms of the exogenous ones:

    yi = δi1·x1 + δi2·x2 + ... + δim·xm,

    In its appearance, the reduced form of the model is no different from a system of independent equations, the parameters of which are estimated by traditional least squares methods. Using OLS, one can estimate δ and then estimate the values ​​of endogenous variables through exogenous ones.
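    A minimal sketch of the link between the two forms for an assumed two-equation model: the reduced-form coefficient matrix delta = (I − B)^(-1)·A is computed from assumed structural coefficients and then re-estimated by OLS from simulated data:

        # Structural form y = B*y + A*x + e  ->  reduced form y = D*x + u, with D = (I - B)^(-1) A.
        import numpy as np

        B = np.array([[0.0, 0.6],
                      [0.4, 0.0]])             # coefficients at endogenous variables (assumed)
        A = np.array([[0.8, 0.0],
                      [0.0, 0.5]])             # coefficients at exogenous variables (assumed)

        D = np.linalg.inv(np.eye(2) - B) @ A   # reduced-form coefficients delta

        rng = np.random.default_rng(4)
        X = rng.normal(size=(500, 2))
        Y = X @ D.T + rng.normal(scale=0.05, size=(500, 2))
        D_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T   # OLS estimate of the reduced form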

    Expanded EM (its blocks)

    Until recently, geographical factors that have a significant impact on the spread of diseases have been studied relatively little. The validity of the assumption of homogeneous mixing of the population in a small town or village has long been questioned, although as a first approximation it is quite acceptable to assume that the movements of sources of infection are random and in many ways resemble the movement of particles in a colloidal solution. It is, of course, necessary to have some idea of the effect that the presence of large numbers of susceptible individuals at fairly large distances from a given source of infection may have.

    In the deterministic model, due to D. Kendall, an infinite two-dimensional continuum of population is assumed, with σ individuals per unit area. Consider the neighbourhood of a point P and assume that the proportions of susceptible, infected and removed individuals are x, y and z, respectively. The quantities x, y and z may be functions of time and position, but their sum must equal unity. The basic equations of motion, similar to system (9.18), have the form

    where is the spatially weighted average

    Let and be constants, be the area element surrounding the point Q, and be a non-negative weighting coefficient.

    Let us assume that the initial concentration of infection is uniformly distributed over some small area surrounding the initial focus. Note also that the multiplier σ is introduced explicitly into the infection-rate product so that the rate of spread of infection remains independent of population density. If y remained constant over the plane, the integral (9.53) would certainly converge. In this case it would be convenient to require that

    The described model allows one to advance mathematical research quite far. It can be shown (with one or two caveats) that a pandemic will cover the entire plane if and only if the population density exceeds a threshold value. If a pandemic has occurred, then its intensity is determined by the only positive root of the equation

    The meaning of this expression is that the proportion of individuals who eventually fall ill in any area, however far it is from the initial epidemic focus, will be no less than this root. This Kendall pandemic threshold theorem is evidently analogous to the Kermack-McKendrick threshold theorem, in which the spatial factor was not taken into account.

    You can also build a model for the following special case. Let x and y be the spatial densities of susceptible and infected individuals, respectively. If we consider the infection to be local and isotropic, then it is easy to show that the equations corresponding to the first two equations of system (9.18) can be written in the form

    where the independent variables now include the spatial coordinates, and

    For the initial period, when it can be approximately considered a constant value, the second equation of system (9.56) will take the form

    This is the standard diffusion equation, the solution of which is

    where the constant C depends on the initial conditions.

    The total number of infected individuals located outside the circle of radius R is equal to

    Hence,

    and if , then . The radius corresponding to any selected value grows at a rate of . This value can be considered as the speed of spread of the epidemic, and its limiting value for large t is equal to . In one measles epidemic in Glasgow, the rate of spread was about 135 m per week for almost six months.

    Equations (9.56) can easily be modified to take into account the migration of susceptible and infected individuals, as well as the emergence of new susceptible individuals. As in the case of recurring epidemics discussed in Sect. 9.4, an equilibrium solution is possible here, but small oscillations decay as quickly or even faster than in the non-spatial model. Thus, it is clear that in this case the deterministic approach has certain limitations. In principle, one should, of course, prefer stochastic models, but usually their analysis is fraught with enormous difficulties, at least if it is carried out in a purely mathematical way.

    Several studies have been carried out to model these processes. Thus, Bartlett used a computer to study several successive artificial epidemics. The spatial factor was taken into account by introducing a grid of cells. Within each cell, typical non-spatial models for continuous or discrete time were used and random migration of infected individuals was allowed between cells sharing a common boundary. Information was obtained about the critical volume of the population, below which the epidemic process attenuates. The main parameters of the model were obtained based on actual epidemiological and demographic data.

    Recently, the author of this book has undertaken a number of similar studies in which an attempt was made to construct a spatial generalization of the stochastic models for the simple and general cases considered in Sect. 9.2 and 9.3. Let us assume that there is a square lattice, each node of which is occupied by one susceptible individual. The source of infection is placed in the center of the square and a process of chain-binomial type is considered for discrete time, in which only individuals directly adjacent to any source of infection are exposed to the danger of infection. These can be either only the four nearest neighbors (Scheme 1), or also individuals located diagonally (Scheme 2); in the second case there will be a total of eight individuals lying on the sides of a square, the center of which is occupied by the source of infection.

    Obviously, the choice of scheme is arbitrary, but in our work the latter arrangement was used.

    First, a simple epidemic without cases of recovery was considered. For convenience, a grid of limited size was used, and information about the condition of each individual (i.e., whether the individual was susceptible or a source of infection) was stored in the computer. During the simulation, a running record of the changes in the condition of all individuals was kept, and the total number of new cases of the disease was calculated at each stage over all squares centered on the original source of infection. The current values of the sum and sum of squares of the number of cases were also recorded in the machine's memory. This made it fairly easy to calculate the means and standard errors. The details of this research will be published in a separate article; here we note only one or two particular features of this work. For example, it is clear that with a very high probability of sufficient contact there will be an almost deterministic spread of the epidemic, in which at each new stage a new ring of squares with sources of infection is added.

    At lower probabilities, a truly stochastic spread of the epidemic will occur. Since each source of infection can infect only eight of its immediate neighbors, and not the entire population, we can expect that the epidemic curve for the entire lattice will not increase as sharply as if the entire population were homogeneously mixed. This prediction does indeed come true, and the number of new cases increases more or less linearly over time until edge effects begin to take effect (since the grid has a limited extent).
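    A minimal sketch of such a lattice simulation under the assumptions stated above (21×21 grid, one central source, eight neighbours at risk, probability of sufficient contact 0.6, no recovery); the random seed and the number of stages are arbitrary:

        # Spatial chain-binomial model of a simple epidemic on a bounded square lattice.
        import numpy as np

        rng = np.random.default_rng(5)
        n, p = 21, 0.6
        infected = np.zeros((n, n), dtype=bool)
        infected[n // 2, n // 2] = True                  # initial source in the centre

        new_cases = []
        for stage in range(20):
            # number of infectious neighbours (including diagonals) of every cell;
            # padding with zeros reproduces the edge effect of a limited grid
            padded = np.pad(infected, 1)
            k = sum(padded[1 + di:1 + di + n, 1 + dj:1 + dj + n].astype(int)
                    for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0))
            p_infect = 1.0 - (1.0 - p) ** k              # chain-binomial escape probability
            fresh = (~infected) & (rng.random((n, n)) < p_infect)
            new_cases.append(int(fresh.sum()))
            infected |= fresh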

    Table 9. Spatial stochastic model of a simple epidemic, built on a 21x21 lattice

    Table 9 shows the results obtained for this grid with one initial source of infection and a probability of sufficient contact equal to 0.6. It can be seen that between the first and tenth stages of the epidemic the average number of new cases increases by approximately 7.5 at each stage. After that the edge effect begins to predominate, and the epidemic curve drops sharply.

    One can also determine the average number of new cases for any given grid point and thereby find the epidemic curve for that point. It is convenient to average over all points lying on the boundary of the square, in the center of which the source of infection is located, although the symmetry in this case will not be complete. Comparing the results for squares of different sizes gives a picture of an epidemic wave moving from the original source of infection.

    Here we have a sequence of distributions whose modes increase in a linear progression and whose dispersion continuously increases.

    A more detailed study of the general epidemic was also carried out, removing infected individuals. Of course, these are all very simplified models. However, it is important to understand that they can be greatly improved. To take into account the mobility of the population, it must be assumed that susceptible individuals become infected from sources of infection that are not their closest neighbors. You may have to use some kind of distance-based weighting here. The modifications that will need to be introduced into the computer program are relatively small. At the next stage, it may be possible to describe in this way real or typical populations with the most diverse structure. This will open up the opportunity to assess the epidemiological state of real populations from the point of view of the risk of epidemics of various types.