Object-oriented data models

    Basic Concepts

    Definition 1

    An object-oriented data model makes it possible to identify individual database records.

    Database records and their processing functions are linked by mechanisms similar to those implemented in object-oriented programming languages.

    Definition 2

    Graphically, the structure of an object-oriented database is represented as a tree whose nodes are objects.

    Object properties are described by a standard type (for example, string) or by a user-defined type (class).

    In Figure 1, the LIBRARY object is the parent of the instance objects of the DIRECTORY, SUBSCRIBER, and ISSUE classes. Different objects of type BOOK can have the same or different parents. Objects of type BOOK that have the same parent must differ at least in their accession number (unique for each copy of the book), but have the same values of the properties author, name, udk and isbn.

    The logical structures of an object-oriented and hierarchical database are superficially similar. They differ mainly in the methods of data manipulation.

    When performing actions on data in an object-oriented model, we use logical operations, which are enhanced by encapsulation, inheritance and polymorphism. With some limitations, you can use operations that are similar to SQL commands (for example, when creating a database).

    When a database is created or modified, indexes (index tables) are generated automatically and subsequently adjusted; they contain the information needed for fast data retrieval.

    Definition 3

    The purpose of encapsulation is to limit the scope of a property name to the boundaries of the object in which it is defined.

    For example, if a property named telephone, holding the phone number of the book's author, is added to the DIRECTORY object, then the DIRECTORY and SUBSCRIBER objects will have properties with the same name. The meaning of each such property is determined by the object in which it is encapsulated.
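    The scoping rule can be illustrated with a minimal Python sketch. The class names mirror the example; the phone numbers are hypothetical values, and Python classes only approximate the database objects described here.

```python
class Directory:
    def __init__(self, telephone):
        # Inside DIRECTORY, `telephone` means the phone number of the book's author.
        self.telephone = telephone


class Subscriber:
    def __init__(self, telephone):
        # Inside SUBSCRIBER, the same name means the subscriber's own phone number.
        self.telephone = telephone


directory = Directory(telephone="555-0101")    # hypothetical value
subscriber = Subscriber(telephone="555-0199")  # hypothetical value

# The name is scoped to each object, so changing one leaves the other untouched.
directory.telephone = "555-0102"
```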

    Definition 4

    Inheritance, the inverse of encapsulation, extends the scope of a property to all descendants of the object.

    For example, all BOOK objects that are descendants of the DIRECTORY object can be assigned the properties of the parent object: author, name, udk and isbn.

    If the inheritance mechanism must be extended to objects that are not direct relatives (for example, to two descendants of one parent), an abstract property of type abs is defined in their common ancestor.

    Thus, the abstract properties number and ticket defined in the LIBRARY object are inherited by all child objects ISSUE, BOOK and SUBSCRIBER. That is why the values of the ticket property of the SUBSCRIBER and ISSUE classes are the same: 00015 (Figure 1).
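    One way to picture inheritance along the object tree is a property lookup that walks from an object up to its ancestors. The sketch below is an assumption about how such a lookup could work, not a description of any particular DBMS; the `Node` class and its `lookup` method are hypothetical names.

```python
class Node:
    """A database object with a parent link and its own properties."""

    def __init__(self, name, parent=None, **properties):
        self.name = name
        self.parent = parent
        self.properties = properties

    def lookup(self, prop):
        """Resolve a property on this object or on its nearest ancestor."""
        node = self
        while node is not None:
            if prop in node.properties:
                return node.properties[prop]
            node = node.parent
        raise KeyError(prop)


# The property is defined once, in the common ancestor.
library = Node("LIBRARY", ticket="00015")
subscriber = Node("SUBSCRIBER", parent=library)
issue = Node("ISSUE", parent=library)

subscriber_ticket = subscriber.lookup("ticket")
issue_ticket = issue.lookup("ticket")
```

    Both children resolve ticket through the same ancestor, which is why they see the same value.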

    Definition 5

    Polymorphism allows the same program code to work with different types of data.

    In other words, it allows objects of different types to have methods (functions or procedures) with the same names.
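    A short Python sketch of this idea follows. The `describe` method and the `describe_all` helper are hypothetical names chosen for illustration.

```python
class Book:
    def describe(self):
        return "book"


class Subscriber:
    def describe(self):
        return "subscriber"


def describe_all(objects):
    # The same code works with objects of different types,
    # because each type provides a method with the same name.
    return [obj.describe() for obj in objects]


descriptions = describe_all([Book(), Subscriber()])
```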

    Searching in an object-oriented database consists of determining the similarity between an object specified by the user and the objects stored in the database.

    Advantages and disadvantages of the object-oriented model

    The main advantage of the object-oriented data model over the relational model is its ability to represent information about complex relationships between objects. The model makes it possible to identify an individual database record and to define the functions for processing it.

    The disadvantages of the object-oriented model are high conceptual complexity, inconvenient data processing and low query speed.

    Today, such systems are quite widespread. They include the following DBMSs:

    • Postgres,
    • Orion,
    • Iris,
    • ODB-Jupiter,
    • Versant,
    • Objectivity/DB,
    • ObjectStore,
    • Statice,
    • GemStone,
    • G-Base.

    Post-relational model

    The classic relational model assumes that the data stored in the fields of table records is indivisible. The post-relational model is an extended relational model that removes this constraint. It allows multivalued fields, that is, fields whose values consist of subvalues. The set of values of a multivalued field is treated as a separate table embedded in the main table.
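    The difference between flat tables and a multivalued field can be sketched in Python. The table and column names follow the invoice example used in this section; the buyer names, quantities and the `nest` helper are hypothetical.

```python
# Relational form: two flat tables linked by the invoice number.
invoices = [
    {"invoice_no": 1, "buyer": "Buyer A"},
    {"invoice_no": 2, "buyer": "Buyer B"},
]
invoice_goods = [
    {"invoice_no": 1, "quantity": 10},
    {"invoice_no": 1, "quantity": 3},
    {"invoice_no": 2, "quantity": 7},
]


def nest(invoices, invoice_goods):
    """Fold the detail rows into a multivalued `quantities` field."""
    nested = []
    for inv in invoices:
        quantities = [
            row["quantity"]
            for row in invoice_goods
            if row["invoice_no"] == inv["invoice_no"]
        ]
        nested.append({**inv, "quantities": quantities})
    return nested


# Post-relational form: one table whose field holds a list of subvalues.
post_relational = nest(invoices, invoice_goods)

# No join is needed to total the quantities of one invoice.
total_for_invoice_1 = sum(post_relational[0]["quantities"])
```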

    Fig. 2.6 uses invoices and goods as an example to compare how the same data is represented in the relational (a) and post-relational (b) models. The figure shows that, compared with the relational model, the post-relational model stores data more efficiently, and processing does not require joining data from two tables.

    (a) Relational model: table Invoices (Invoice No., Buyer) and table Invoices-goods (Invoice No., Quantity).

    (b) Post-relational model: table Invoices (Invoice No., Buyer, Quantity).

    Fig. 2.6. Data structures of relational and post-relational models

    Since the post-relational model allows non-normalized data to be stored in tables, the problem of ensuring data integrity and consistency arises. This problem is solved by including appropriate mechanisms in the DBMS.

    The advantage of the post-relational model is the ability to represent a set of related relational tables as a single post-relational table. This makes the information easier to read and increases the efficiency of its processing.

    The disadvantage of the post-relational model is the difficulty of ensuring the integrity and consistency of the stored data.

    The post-relational data model considered here is supported by the uniVers DBMS. Other DBMSs based on the post-relational data model include the Bubba and Dasdb systems.

    Multidimensional model

    The multidimensional approach to data representation appeared almost simultaneously with the relational one, but interest in multidimensional DBMSs became widespread only in the mid-1990s. The impetus was a 1993 article by E. Codd, which formulated 12 basic requirements for OLAP (OnLine Analytical Processing) systems, the most important of which concern the conceptual representation and processing of multidimensional data.

    Two directions can be distinguished in the development of information system concepts:

    • operational (transactional) processing systems;
    • analytical processing systems (decision support systems).

    Relational DBMSs were designed for operational (transactional) information processing systems and are very effective in this area. In analytical processing systems they have proven somewhat clumsy and insufficiently flexible. Multidimensional DBMSs are more effective here.

    Multidimensional DBMSs are highly specialized DBMSs designed for interactive analytical processing of information. The main concepts used in these DBMSs are aggregability, historicity and predictability.

    Aggregability of data means that information is considered at various levels of generalization. In information systems, the level of detail presented to a user depends on the user's role: analyst, user or manager.

    Historicity of data means ensuring that the data and their relationships remain largely static, and that the data are always bound to time.

    Predictability of data means specifying forecasting functions and applying them to different time intervals.

    Multidimensionality of a data model does not mean multidimensional visualization of digital data, but a multidimensional logical representation of the structure of information in the description and in data manipulation operations.

    Compared to the relational model, the multidimensional organization of data is more visual and informative. As an illustration, Fig. 2.7 shows relational (a) and multidimensional (b) representations of the same data on car sales volumes.

    The basic concepts of multidimensional data models are the dimension and the cell.

    A dimension is a set of data of the same type that forms one of the faces of a hypercube. In a multidimensional model, dimensions play the role of indices that identify specific values in the cells of the hypercube.

    A cell is a field whose value is uniquely determined by a fixed set of dimension values. The field type is most often numeric. Depending on how its values are produced, a cell can be a variable (the values change and can be loaded from an external data source or generated programmatically) or a formula (the values, like those of formula cells in spreadsheets, are calculated using predefined formulas).

    Fig. 2.7. Relational and multidimensional representation of data

    In the example in Fig. 2.7 (b), each value of the cell Sales volume is uniquely determined by a combination of the time dimension Sales month and the car model. In practice, more dimensions are often required. An example of a three-dimensional data model is shown in Fig. 2.8.
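    The role of dimensions as indices, and the idea of aggregation over a dimension, can be sketched in Python. The dictionary below stands in for a two-dimensional cube; the months, car models, sales figures and the `roll_up` helper are hypothetical and are not taken from Fig. 2.7.

```python
from collections import defaultdict

DIMENSIONS = ("month", "model")

# Each cell is addressed by one value per dimension: (month, model) -> sales volume.
sales = {
    ("June", "Sedan"): 12,
    ("June", "Wagon"): 5,
    ("July", "Sedan"): 9,
    ("July", "Wagon"): 7,
}


def roll_up(cube, keep):
    """Aggregate the cube over every dimension except `keep`."""
    axis = DIMENSIONS.index(keep)
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)


by_month = roll_up(sales, "month")
by_model = roll_up(sales, "model")
```

    Adding a third dimension would only lengthen the key tuples; the lookup and the aggregation stay the same.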

    Fig. 2.8. Example of a three-dimensional model

    Existing multidimensional DBMSs use two main data organization schemes: hypercubic and polycubic.

    The polycubic scheme assumes that several hypercubes with different numbers of dimensions and with different dimensions as faces can be defined in the database. An example of a system that supports the polycubic variant of a database is Oracle Express Server.

    The hypercubic scheme assumes that all cells are defined by the same set of dimensions. This means that if the database contains several hypercubes, they all have the same dimensionality and the same dimensions.
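    The distinction between the two schemes can be expressed as a simple check. In this sketch a database is modeled, as an assumption, by a mapping from cube names to their dimension names; the cube names and the `is_hypercubic` helper are hypothetical.

```python
def is_hypercubic(cubes):
    """True if every cube in the database is defined over the same dimensions."""
    dimension_sets = {frozenset(dims) for dims in cubes.values()}
    return len(dimension_sets) <= 1


# Cubes with different dimensions: only a polycubic scheme allows this.
polycubic_db = {
    "sales": ("month", "model"),
    "stock": ("month", "model", "dealer"),
}

# All cubes share one set of dimensions: a hypercubic scheme.
hypercubic_db = {
    "sales": ("month", "model"),
    "plan": ("month", "model"),
}
```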

    The main advantage of the multidimensional data model is the convenience and efficiency of analytical processing of large volumes of time-related data.

    The disadvantage of the multidimensional data model is that it is cumbersome for the simplest tasks of ordinary operational information processing.

    Examples of systems that support multidimensional data models are Essbase, Media Multi-matrix, Oracle Express Server and Cache. Some software products, for example Media / MR, allow you to work with multidimensional and relational databases simultaneously.

    Object-oriented model

    In an object-oriented model, when presenting data, it is possible to identify individual database records. Relationships are established between records and their processing functions using mechanisms similar to the corresponding tools in object-oriented programming languages.

    The standardized object-oriented model is described in the recommendations of the ODMG-93 (Object Database Management Group) standard.

    Let's consider a simplified model of an object-oriented database. The structure of an object-oriented database is graphically represented as a tree whose nodes are objects. The properties of objects are described by a standard type or by a user-constructed type (defined as a class). The value of a property of type class is an object that is an instance of the corresponding class. Each object that is an instance of a class is considered a child of the object in which it is defined as a property. An instance object belongs to its class and has one parent. The parent-child relationships in the database form a connected hierarchy of objects. An example of the logical structure of an object-oriented library science database is shown in Fig. 2.9. Here, an object of type Library is the parent of instance objects of the classes Subscriber, Catalog and Issue. Different objects of type Book may have the same or different parents. Objects of type Book that have the same parent must differ at least in accession number (unique for each copy of the book), but have the same values of the properties isbn, udk, name and author.

    The logical structure of an object-oriented database is superficially similar to the structure of a hierarchical database. The main difference between them is the data manipulation methods.

    To perform actions on data in the database model under consideration, logical operations are used, enhanced by object-oriented mechanisms of encapsulation, inheritance and polymorphism.

    Encapsulation limits the scope of a property name to the boundaries of the object in which it is defined. So, if a property named telephone, which holds the phone number of the book's author, is added to an object of type Catalog, the Subscriber and Catalog objects will have properties with the same name. The meaning of such a property is determined by the object in which it is encapsulated.

    Inheritance, on the contrary, extends the scope of a property to all descendants of the object. Thus, all objects of type Book that are descendants of an object of type Catalog can be assigned the properties of the parent object: isbn, udk, name and author. If the inheritance mechanism must be extended to objects that are not direct relatives (for example, to two children of the same parent), an abstract property of type abs is defined in their common ancestor. Thus, defining the abstract properties ticket and number in the Library object causes these properties to be inherited by all child objects Subscriber, Book and Issue. It is therefore no coincidence that the values of the ticket property of the Subscriber and Issue classes shown in Fig. 2.9 are the same: 00015.

    Polymorphism in object-oriented programming languages means the ability of the same program code to work with different types of data. In other words, objects of different types may have methods (procedures or functions) with the same names. During execution of an object program, the same methods operate on different objects depending on the type of the argument. In the example under consideration, polymorphism means that objects of class Book that have different parents of class Catalog may have different sets of properties. Therefore, programs that work with objects of class Book may contain polymorphic code.

    Searching in an object-oriented database involves finding similarities between an object specified by the user and objects stored in the database.

    Fig. 2.9. Logical structure of the library science database

    The main advantage of the object-oriented data model over the relational one is its ability to represent information about complex relationships between objects. An object-oriented data model allows you to identify individual database records and define functions for processing them.

    The disadvantages of the object-oriented model are high conceptual complexity, inconvenient data processing and low query execution speed.

    Object-oriented DBMSs include POET, Jasmine, Versant, O2, ODB-Jupiter, Iris, Orion and Postgres.

    Object-oriented model

    In an object-oriented model, when presenting data, it is possible to identify individual database records. Relationships are established between database records and their processing functions using mechanisms similar to the corresponding facilities in object-oriented programming languages.

    The standardized object-oriented model is described in the recommendations of the ODMG-93 (Object Database Management Group) standard. The ODMG-93 recommendations have not yet been fully implemented. To illustrate the key ideas, let's consider a somewhat simplified model of an object-oriented database.

    The structure of an object-oriented database (for example, Versant Object Database, ObjectStore, etc.) is graphically represented as a tree whose nodes are objects. The properties of objects are described by a standard type (for example, string) or by a user-constructed type (defined as class).

    The value of a property of type string is a string of characters. The value of a property of type class is an object that is an instance of the corresponding class. Each object that is an instance of a class is considered a descendant of the object in which it is defined as a property. An instance object belongs to its class and has one parent. The parent-child relationships in the database form a connected hierarchy of objects.

    The logical structure of an object-oriented database is superficially similar to the structure of a hierarchical database. The main difference between them is the methods of data manipulation.

    To perform actions on data in the database model under consideration, logical operations are used, enhanced by object-oriented mechanisms of encapsulation, inheritance and polymorphism.

    Operations similar to SQL commands can be used to a limited extent (for example, to create a database).

    The creation and modification of a database is accompanied by the automatic formation and subsequent adjustment of indexes (index tables) containing information for quick data retrieval.

    Let's briefly consider the concepts of encapsulation, inheritance and polymorphism in relation to the object-oriented database model.

    Encapsulation limits the scope of a property name to the boundaries of the object in which it is defined.

    Inheritance, on the contrary, extends the scope of a property to all descendants of the object.

    Polymorphism in object-oriented programming languages means the ability of the same program code to work with different types of data. In other words, objects of different types may have methods (procedures or functions) with the same names. During execution of an object program, the same methods operate on different objects depending on the type of the argument.

    Searching in an object-oriented database involves finding similarities between an object specified by the user and objects stored in the database. The user-defined object, called a goal object (the object's property is of type goal), can in general be a subset of the entire hierarchy of objects stored in the database. The goal object, as well as the result of the query, can be stored in the database itself.

    The main advantage of an object-oriented data model in comparison with a relational one is the ability to display information about complex relationships between objects. An object-oriented data model allows you to identify individual database records and define functions for processing them.

    The disadvantages of the object-oriented model are high conceptual complexity, inconvenient data processing and low query speed.

    Data Types

    Initially, DBMSs were used primarily to solve financial and economic problems. In this case, regardless of the presentation model, the following main data types were used in databases:

    • numeric. Examples of data values: 0.43; 328; 2E+5;
    • symbolic (alphanumeric). Examples of data values: "Friday", "string", "programmer";
    • dates, specified using a special Date type or as regular character data. Examples of data values: 12/1/97, 23/2/1999.

    In different DBMSs, these types could differ slightly from each other in name, range of values, and type of representation. Subsequently, specialized data processing systems began to appear in new areas of application, such as geoinformation systems, video image processing, etc. In this regard, developers began to introduce new data types into traditional DBMSs. Relatively new data types include the following:

    • time and date-time types, designed to store information about time and/or date. Examples of data values: 01/31/85 (date), 9:10:03 (time), 03/6/1960 12:00 (date and time);
    • variable-length character types, intended to store long text, for example a document;
    • binary types, intended for storing graphic objects, audio and video information, spatial, chronological and other special information. For example, in MS Access this is the “OLE Object Field” data type, which allows graphic data in the BMP (Bitmap) format to be stored in the database and displayed automatically when working with the database;
    • hyperlinks, designed to store links to various resources (nodes, files, documents, etc.) located outside the database, for example on the Internet, on a corporate intranet or on the computer's hard drive.

    All of the listed data types can be used in modern DBMSs, whatever their data model.

    In an object-oriented model (OOM), when presenting data, it is possible to identify individual database records. Relationships are established between database records and their processing functions using mechanisms similar to the corresponding facilities in object-oriented programming languages.

    The standard OOM is described in the recommendations of the ODMG-93 standard (Object Database Management Group). The ODMG-93 recommendations have not yet been fully implemented. To illustrate the key ideas, consider a somewhat simplified model of an object-oriented database.

    The structure of an object-oriented database can be graphically represented as a tree, the nodes of which are objects. The properties of objects are described by some standard type (for example, string) or a user-constructed type (defined as class).

    The value of a property of type string is a string of characters. The value of a property of type class is an object that is an instance of the corresponding class. Each object instance of a class is considered a child of the object in which it is defined as a property. An instance object of a class belongs to its class and has one parent. Generic relationships in the database form a connected hierarchy of objects.

    An example of the logical structure of an object-oriented library science database is shown in Fig. 3.14. Here, an object of type LIBRARY is the parent of instance objects of the SUBSCRIBER, DIRECTORY and ISSUE classes. Different objects of type BOOK that have the same parent must differ at least in accession number (unique for each copy of the book), but have the same values of the properties isbn, udk, title and author.


    Fig. 3.14. Logical structure of a library science database

    The logical structure of an object-oriented database is superficially similar to the structure of a hierarchical database. The main difference between them is the methods of data manipulation. To perform actions on data in an OOM database, logical operations are used, enhanced by object-oriented mechanisms of encapsulation, inheritance and polymorphism. Operations similar to SQL commands can be used to a limited extent (for example, to create a database).

    The creation and modification of a database is accompanied by the automatic formation and subsequent adjustment of indexes (index tables) containing information for quick data retrieval.

    Let's briefly consider the concepts of encapsulation, inheritance and polymorphism in relation to OOM databases.

    Encapsulation limits the scope of a property name to the boundaries of the object in which it is defined. So, if you add a property to an object of type DIRECTORY that specifies the phone number of the author of the book and has the name telephone, then we will get properties of the same name for the SUBSCRIBER and DIRECTORY objects. The meaning of such a property will be determined by the object in which it is encapsulated.

    Inheritance, on the contrary, extends the scope of a property to all descendants of the object. Thus, all objects of the BOOK type that are descendants of an object of the DIRECTORY type can be assigned the properties of the parent object: isbn, udk, title and author. If the inheritance mechanism must be extended to objects that are not direct relatives (for example, to two children of the same parent), an abstract property of type abs is defined in their common ancestor. Thus, defining the abstract properties ticket and number in the LIBRARY object causes these properties to be inherited by all child objects SUBSCRIBER, BOOK and ISSUE. It is no coincidence that the values of the ticket property of the SUBSCRIBER and ISSUE classes shown in the figure are the same: 00015.

    Polymorphism in object-oriented programming languages means the ability of the same program code to work with different types of data. In other words, objects of different types may have methods (procedures or functions) with the same names. During execution of an object program, the same methods operate on different objects depending on the type of the argument. In relation to our object-oriented database, polymorphism means that objects of the BOOK class that have different parents of the DIRECTORY class can have different sets of properties. Consequently, programs that work with objects of the BOOK class can contain polymorphic code.

    Searching in an object-oriented database consists of finding similarities between an object specified by the user and objects stored in the database. The user-defined object, called a goal object (the object's property is of type goal), can in general be a subset of the entire hierarchy of objects stored in the database. The goal object, as well as the result of the query, can be stored in the database itself. An example of a query for the library card numbers and names of subscribers who have received at least one book from the library is shown in Fig. 3.15.
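    Search by similarity to a goal object can be sketched as pattern matching over nested structures. This is a minimal illustration under stated assumptions: objects are modeled as Python dictionaries, and the `matches` function, the property names and the sample data are all hypothetical.

```python
def matches(goal, obj):
    """True if every property named in `goal` is present in `obj` with a matching value.

    A nested dictionary in the goal is matched recursively, so the goal
    can describe a subset of the stored object hierarchy.
    """
    for key, wanted in goal.items():
        if key not in obj:
            return False
        if isinstance(wanted, dict):
            if not isinstance(obj[key], dict) or not matches(wanted, obj[key]):
                return False
        elif obj[key] != wanted:
            return False
    return True


# Hypothetical stored objects: only the first subscriber has an issue record.
subscribers = [
    {"ticket": "00015", "name": "Subscriber A", "issue": {"number": "0001"}},
    {"ticket": "00016", "name": "Subscriber B"},
]

# Goal object: any subscriber that has an issue record, whatever its contents.
goal = {"issue": {}}

found = [(s["ticket"], s["name"]) for s in subscribers if matches(goal, s)]
```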

    The main advantage of the OOM over the relational model is its ability to represent information about complex relationships between objects. The OOM makes it possible to identify an individual database record and to define the functions for processing it.

    The disadvantages of the OOM are high conceptual complexity, inconvenient data processing and low query execution speed.



    Fig. 3.15. Database fragment with goal object

    Let us return to the Orders task, presented as a relational data model in Fig. 3.8, and consider it in terms of an object-oriented database. The example contains three classes: “Clients”, “Orders” and “Goods”. Objects of the class “Clients” are specific clients; the class properties are the client number, client name, city, status, etc. The class methods are “Create an order”, “Pay the bill”, etc. A method is an operation that can be applied to an object; a method is what an object should do. A class corresponding to the table “Order Details” is not required: the data of that table can be part of the class “Orders”. The presence of the method “Create an order” in the class “Clients” leads to interaction with objects of the classes “Orders” and “Goods”. The user does not need to know about this interaction between objects. The user only accesses the object “Orders” and uses the method “Create an order”. The effect on other databases may be hidden from the user. If the method “Create an order”, in turn, calls the method “Check the client's creditworthiness”, this fact can also be hidden from the user. In relational databases, performing the same functions requires writing procedures in Visual Basic for Applications (VBA).
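    The interaction between the three classes can be sketched in Python. The class and method names follow the example; the `credit_limit` property, the rule used for the creditworthiness check and the sample data are assumptions made only for illustration.

```python
class Product:
    def __init__(self, name, price):
        self.name = name
        self.price = price


class Order:
    def __init__(self, client, lines):
        self.client = client
        # Order details live inside the order: a list of (Product, quantity) pairs,
        # so no separate "Order Details" class is needed.
        self.lines = lines

    def total(self):
        return sum(product.price * quantity for product, quantity in self.lines)


class Client:
    def __init__(self, number, name, credit_limit):
        self.number = number
        self.name = name
        self.credit_limit = credit_limit  # assumed property, not from the example
        self.orders = []

    def _check_creditworthiness(self, amount):
        # Hidden from the caller; the rule itself is an assumption.
        return amount <= self.credit_limit

    def create_order(self, lines):
        """Create an order; the caller does not see the objects involved."""
        order = Order(self, lines)
        if not self._check_creditworthiness(order.total()):
            raise ValueError("credit limit exceeded")
        self.orders.append(order)
        return order


client = Client(number=1, name="Client A", credit_limit=100)
order = client.create_order([(Product("pen", 2), 10), (Product("pad", 5), 4)])
```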

    In the 1990s, there were experimental prototypes of object-oriented database management systems. Currently, such systems are widespread. They include, in particular, the following DBMSs: POET (POET Software), Jasmine (Computer Associates), Versant (Versant Technologies), O2 (Ardent Software), ODB-Jupiter (Inteltek Plus Research and Production Center), as well as Iris, Orion and Postgres.

    The first formalized and generally accepted data model was Codd's relational model. In this model, as in all subsequent ones, three aspects were distinguished: structure, integrity and manipulation. Data structures in the relational model are based on flat normalized relations, integrity constraints are expressed using first-order logic, and data manipulation is based on relational algebra or the equivalent relational calculus. As many researchers note, the relational data model owes much of its success to the fact that it relied on the rigorous mathematical apparatus of set theory, relations and first-order logic. The developers of any particular relational system considered it their duty to show that their specific data model conformed to the general relational model, which served as a measure of how “relational” the system was.

    The main difficulties of object-oriented data modeling stem from the fact that such a developed mathematical apparatus on which a general object-oriented data model could be based does not exist. This is largely why there is still no basic object-oriented model. On the other hand, some authors argue that a general object-oriented data model in the classical sense cannot be defined because the classical concept of a data model is unsuitable for the object-oriented paradigm.

    One of the best-known theorists in the field of data models, Beeri, proposes in general terms a formal framework for OODB, which is far from complete and is not a data model in the traditional sense, but allows researchers and developers of OODB systems to at least speak the same language (provided, of course, that Beeri's proposals are developed and supported). Regardless of the further fate of these proposals, we consider it useful to summarize them briefly.

    First, following the practice of many OODBs, it is proposed to distinguish two levels of object modeling: lower (structural) and upper (behavioral). The structural level supports complex objects, their identification, and varieties of the "isa" relationship. A database is a collection of data elements related by an “is a member of a class” or “is an attribute” relationship. Thus, the database can be viewed as a directed graph. An important point is that the concept of a value is maintained alongside the concept of an object (later we will see how much is built on this in O2, one of the successful object-oriented DBMSs).



    An important aspect is the clear separation of the database schema from the database itself. The primary concepts of the OODB schema level are types and classes. It is noted that in all systems that use only one concept (either a type or a class), this concept is inevitably overloaded: a type implies a certain set of values determined by the data structure of that type; a class also implies a set of objects, but this set is defined by the user. Thus, types and classes play different roles, and rigor and unambiguity require simultaneous support for both concepts.

    Beeri does not present a complete formal model of the structural level of OODB, but expresses confidence that the current level of understanding is sufficient to formalize such a model. As for the behavioral level, only a general approach to the logical apparatus it requires is proposed (first-order logic is not enough).

    Beeri's important, although not well-founded, assumption is that the two traditional levels, schema and data, are not enough for OODB. A precise definition of OODB requires a meta-schema level, whose contents must define the kinds of objects and relationships allowed at the database schema level. The meta-schema should play the same role for OODB as the structural part of the relational data model plays for relational database schemas.

    There are many other publications related to the topic of object-oriented data models, but they either touch on rather specific issues or use mathematical apparatus that is too serious for this review (for example, some authors define an object-oriented data model based on category theory).

    To illustrate the current state of affairs, we will briefly consider the features of a specific data model used in the object-oriented O2 DBMS (this, of course, is also not a data model in the classical sense).

    O2 supports objects and values. An object is a pair (identifier, value), and objects are encapsulated, i.e. their values are accessible only through methods, which are procedures bound to objects. Values can be atomic or structured. Structured values are constructed from values or from objects represented by their identifiers, using the set, tuple and list constructors. The elements of a structured value are accessed using predefined operations (primitives).
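    The (identifier, value) pairing can be approximated in Python. This sketch is not O2 syntax: the `O2Object` class, the `get` accessor and the sample values are hypothetical, a dictionary stands in for the tuple constructor, and Python only hides the value by naming convention rather than enforcing encapsulation.

```python
import itertools

_oid_counter = itertools.count(1)


class O2Object:
    """An object modeled as an (identifier, value) pair."""

    def __init__(self, value):
        self.oid = next(_oid_counter)  # system-generated identifier
        self._value = value            # by convention reached only through methods

    def get(self, attribute):
        return self._value[attribute]


# An atomic value wrapped in a tuple-like structure (dict stands in for `tuple`).
author = O2Object({"name": "Example Author"})

# A structured value built with list and set constructors;
# the author object is referenced by its identifier, not copied.
book = O2Object({
    "title": "Example Title",
    "authors": [author.oid],
    "keywords": frozenset({"databases", "objects"}),
})
```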

    There are two possible types of data organization: classes, whose instances are objects that encapsulate data and behavior, and types, whose instances are values. Each class is associated with a type that describes the structure of instances of the class. Types are defined recursively based on atomic types and previously defined types and classes using constructors. The behavioral side of a class is determined by a set of methods.

    Objects and values can be named. Naming an object or value determines its persistence: any named object or value is persistent, and any object or value that is part of another named object or value is persistent as well.
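    This rule amounts to reachability from named roots, which can be sketched as a graph traversal. The representation is an assumption: `components` maps each object or value to the parts it contains, and all names in the sample data are hypothetical.

```python
def persistent_set(named_roots, components):
    """Return everything reachable from a named object or value."""
    seen = set()
    stack = list(named_roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Parts of a persistent object or value are persistent too.
        stack.extend(components.get(node, ()))
    return seen


components = {
    "library": ["catalog", "subscriber_1"],
    "catalog": ["book_1"],
    "scratch": ["temp_1"],  # unnamed, so neither it nor its parts persist
}

persistent = persistent_set(["library"], components)
```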

    Using a special instruction specified when defining a class, you can achieve long-term storage of any object of this class. In this case, the system automatically generates a set value whose name matches the name of the class. This set is guaranteed to contain all objects of this class.

    A method is program code bound to a specific class and applicable to objects of that class. A method in O2 is defined in two steps. First, the method signature is declared, i.e. its name, class, argument types or classes, and result type or class. Methods can be public (accessible from objects of other classes) or private (accessible only within the given class). In the second step, the implementation of the method is written in one of the O2 programming languages (the languages are discussed in more detail in the next section of our review).

    The O2 model supports multiple class inheritance based on the supertype/subtype relationship. A subclass allows the addition and/or overriding of attributes and methods. Possible ambiguities in multiple inheritance (in the naming of attributes and methods) are resolved either by renaming or by explicitly indicating the source of inheritance. A subclass object is an object of each superclass from which the subclass is derived.
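    The two ways of resolving a naming conflict can be imitated in Python, which also supports multiple inheritance. This is an analogy rather than O2 code; the class and method names are hypothetical.

```python
class Person:
    def label(self):
        return "person"


class Employee:
    def label(self):
        return "employee"


class Librarian(Person, Employee):
    # Both superclasses define `label`, so the name is ambiguous.

    def label(self):
        # Resolution by explicitly indicating the source of inheritance.
        return Employee.label(self)

    def person_label(self):
        # Resolution by renaming: the other variant stays available under a new name.
        return Person.label(self)


librarian = Librarian()
```

    The last sentence of the paragraph also holds in this sketch: `librarian` is an instance of both `Person` and `Employee`.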

    A predefined class "Object" is supported, which is the root of the class lattice; any other class is an implicit successor of the "Object" class and inherits predefined methods ("is_same", "is_value_equal", etc.).
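    The difference between the two predefined methods named above can be illustrated as identity versus value equality. The semantics assumed here are inferred from the method names only and are not confirmed by the text; the `BaseObject` class and its fields are hypothetical.

```python
class BaseObject:
    """Stand-in for a root class; every object has an identifier and a value."""

    def __init__(self, oid, value):
        self.oid = oid
        self.value = value

    def is_same(self, other):
        # Assumed meaning: both references denote one and the same object.
        return self.oid == other.oid

    def is_value_equal(self, other):
        # Assumed meaning: the objects may differ but carry equal values.
        return self.value == other.value


first_copy = BaseObject(1, {"title": "Example Title"})
second_copy = BaseObject(2, {"title": "Example Title"})
```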

    A specific feature of the O2 model is the ability to declare additional "exclusive" attributes and methods for named objects. This means that a particular named object that is an instance of a class can have a type that is a subtype of the class type. Of course, the standard methods of the class do not work with such attributes, but additional methods (or overridden standard ones) can be defined specifically for the named object, and the additional attributes are available to them. It is emphasized that additional attributes and methods are tied not to a specific object but to a name, which in general can refer to different objects at different times. Implementing exclusive attributes and methods requires late binding techniques.

    In the next section, we will, among other things, look at the features of the programming and query languages of the O2 system, which, of course, are closely related to the specifics of the data model.