• Using XML in Flash. How to open an XML document - features of working with XML files

    XML (Extensible Markup Language) was developed by the XML Working Group of the World Wide Web Consortium (W3C). Here's how its creators describe it:

    "Extensible Markup Language (XML) is a subset of SGML... Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML."

    This is an excerpt from version 1.0 of the XML specification, published by the XML Working Group in February 1998. The entire document can be found on the W3C website at http://www.w3.org/TR/REC-xml.

    XML is a markup language designed specifically for placing information on the World Wide Web, much like HTML (Hypertext Markup Language), which became the original standard language for creating Web pages. Since HTML appears to satisfy all our needs, the question arises: why was a completely new language for the Web necessary? What are its advantages and strengths? How does it interact with HTML? Will it replace HTML, or just improve upon it? Finally, what is SGML, of which XML is a subset, and why can't SGML itself be used for Web pages? In this chapter I will try to answer all these questions.

    Purpose of XML

    The HTML language provides a fixed set of elements that you can use to mark up the components of a typical Web page. Examples of such elements include headings, paragraphs, lists, tables, images, and links. For example, HTML is great for creating a personal home page. Below is the description of a home page in HTML code:

    Home Page

    Michael Young's Home Page

    Welcome to my Web site!

    Web Site Contents

    Please choose one of the following topics:

    • Writing
    • Family
    • Photo Gallery

    Other Interesting Web Sites

    Click one of the following to explore another Web site:

      "1. XML should become the language of direct use on the Internet."

      As you might have guessed, XML was designed primarily for storing and distributing information on the Web.

      "2. XML will support a large number of applications."

      Although its primary purpose is to distribute information on the Web through servers and browsers, XML is also designed to be used by other programs. For example, XML is used to exchange information between financial programs, to distribute and update software products, and to write voice scripts for delivering information over the phone.

      "3. XML will be compatible with SGML."

      XML is a specialized subset of SGML. The advantage is that existing SGML software can easily be adapted to work with XML.

      "4. It will be easier to write programs that process XML documents."

      For XML to be practical, it must be easy to write browsers and other programs that process XML documents. In fact, the main reason for deriving XML from SGML was to make such programs easier to write.

      "5. The number of optional features in XML should be minimal, ideally zero."

      Keeping optional features to a minimum makes it easy to write programs that process XML documents. The abundance of optional features in SGML was the main reason it proved impractical for representing Web documents. Optional SGML features include redefining the delimiter characters for tags (usually < and >) and omitting the end tag when the processor can detect where the element ends. A program that processes SGML documents strictly must allow for every optional feature, even those that are rarely used.

      "6. XML documents should be clear and understandable to the user."

      XML is intended to become the lingua franca (universal language) for exchanging information among users and programs around the world. According to this concept, users, as well as specialized programs, should be able to create and read XML documents. Accessibility and transparency for the user distinguish XML from most other formats used in the construction of databases and text documents.

      The user can easily read an XML document because it is written in plain text and has a logical, hierarchical tree structure. You can make XML documents easier to read by assigning meaningful names to elements, attributes, and entities, and by adding useful comments. (This will be discussed later in this chapter.)

      "7. XML development should be completed fairly quickly."

      XML will only become a widely accepted standard if programmers and users accept it. This standard must be created before society accepts the alternative standards that are increasingly being created by software companies.

      "8. XML should be formal and concise."

      The XML specification is written in a formal language used to represent computer languages, with a notation known as Extended Backus-Naur Form (EBNF). This formal language, although quite difficult to understand, is devoid of ambiguity and greatly facilitates the writing of XML documents, and especially programs for processing them.

      "9. XML documents will be easier to create."

      The practical use of XML as a markup language for Web documents simplifies not only the writing of processing programs, but also the process of creating the XML documents themselves.

      "10. The compressed form is not important in XML markup."

      In accordance with point 6 (the XML document must be clear and understandable to the user), the XML markup should not be overly compressed so as not to conflict with the specified purpose.

      Standard XML Applications

      You can use XML for more than just describing a single document. An individual, company, or standards committee can define the set of XML elements and the document structure to be used for a particular class of documents. Such a set of elements, together with the description of the document structure, is called an XML application or an XML vocabulary.

      For example, an organization might define an XML application to create documents describing molecular structures, human resources, multimedia presentations, or containing vector graphics. At the end of the chapter is a list of some common XML applications that have already been created and applications that are planned to be created.

      An XML application is typically defined by creating a document type definition (DTD), which is an optional component of an XML document. A DTD is similar to a database schema: it establishes and defines the names of elements that can be used in a document, the order in which elements can appear, the attributes that elements can have, and other features of the document. To actually use an XML application, you typically include its DTD in your XML document; having a DTD in a document restricts the elements and structures you can use, which ensures that your document meets the standards of that application. The XML documents discussed earlier in this chapter did not include DTDs. You'll learn how to define and use DTDs in Chapter 5.
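      To make the idea of such rules concrete, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. It is not a DTD and not a real validator: the `ALLOWED_CHILDREN` table, the `check_structure` function, and the element names are all hypothetical, chosen only to mimic the kind of constraint a DTD declares (which elements exist and in what order their children appear).

```python
import xml.etree.ElementTree as ET

# Hypothetical rules in the spirit of a DTD: for each element name,
# the exact sequence of child elements it must contain.
ALLOWED_CHILDREN = {
    "book": ["title", "author", "price"],
    "title": [],
    "author": [],
    "price": [],
}

def check_structure(element):
    """Return a list of problems found in the element and its descendants."""
    problems = []
    if element.tag not in ALLOWED_CHILDREN:
        problems.append(f"unknown element <{element.tag}>")
        return problems
    actual = [child.tag for child in element]
    expected = ALLOWED_CHILDREN[element.tag]
    if actual != expected:
        problems.append(f"<{element.tag}> contains {actual}, expected {expected}")
    for child in element:
        problems.extend(check_structure(child))
    return problems

good = ET.fromstring(
    "<book><title>XML</title><author>A. Writer</author><price>10</price></book>"
)
bad = ET.fromstring("<book><author>A. Writer</author><title>XML</title></book>")

print(check_structure(good))  # no problems reported
print(check_structure(bad))   # one problem: wrong children for <book>
```

      A real DTD expresses far more than this table can (optional and repeated elements, attributes, entities), and a validating parser enforces it automatically; the sketch only shows why a shared, machine-checkable description of structure is useful.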

      The benefits of using standard XML applications when developing your documents are that you can share the documents with all other users of the application, and the document can be processed and displayed using software that has already been built for the application.

      XML applications that improve the quality of XML documents

      In addition to XML applications for describing specific document classes, there are several XML applications that you can use within any type of XML document. These applications make document creation easier and improve its quality. Below are examples of such applications.

      • Extensible Stylesheet Language (XSL) allows you to create powerful stylesheets using XML syntax.
      • XML Schema allows you to develop detailed schemas for your XML documents using standard XML syntax, a more powerful alternative to using DTDs.
      • XML Linking Language (XLink) gives you the ability to link your XML documents. It supports multiple destination links and other useful features, providing greater freedom than HTML's linking mechanism.
      • XML Pointer Language (XPointer) allows you to define flexible target links. When XPointer and XLink are used together, you can link to anywhere in the target document - not just jumps to specific points.

      XSL will be covered in Chapter 10. The other XML applications are not yet mature and are not covered in this book. (XLink and XPointer are not supported in Internet Explorer 5.)

      As you can see, XML is not only a useful tool for describing documents, but it also serves as the basis for building applications and extensions that may be in demand as the Internet evolves.

      Real Use of XML

      Although the concept of XML is quite interesting, you may be wondering how to put it into practice. This section lists examples of practical uses of XML, both those already widespread and those expected in the future. Where a corresponding XML application exists, it is given in parentheses. For example, you will find that the MathML XML application lets you format mathematical formulas.

      Link. A more complete list of current and upcoming XML applications, including detailed descriptions, can be found on the Oasis SGML/XML Web page (http://www.oasis-open.org/cover/xml.html#applications).

      • Working with databases. Like traditional databases, XML can be used to assign a label to each field of information within each database record. (For example, you can tag each name, address, and phone number within your address list entries.) You can then display the data in various ways and organize search, sorting, filtering and other processing of data.
      • Structuring documents. The hierarchical structure of XML documents is ideal for marking up the structure of documents such as novels, scientific papers, and plays. For example, you can use XML to mark up a play into acts, scenes, characters, plot lines, scenery, etc. XML markup allows programs to display or print the document in the required format; find, extract, or manipulate information in a document; generate tables of contents, summaries and annotations; process information in other ways.
      • Working with vector graphics (VML - Vector Markup Language).
      • Multimedia presentations (SMIL - Synchronized Multimedia Integration Language, HTML + TIME - HTML Timed Interactive Multimedia Extensions).
      • Description of channels. Channels are Web pages that are automatically sent to subscribers. (CDF - Channel Definition Format).
      • Description of software packages and their relationships. Such descriptions ensure the distribution and updating of software products on the network (OSD - Open Software Description).
      • Communication between applications over the Web using XML messages. These messages are independent of operating systems, object models, and programming languages (SOAP - Simple Object Access Protocol).
      • Sending electronic business cards via e-mail.
      • Exchange of financial information. Information is exchanged in an open and understandable format between financial programs (such as Quicken and Microsoft Money) and financial institutions (banks, public funds) (OFX - Open Financial Exchange).
      • Create, manage and use complex digital forms for commercial Internet transactions. Such forms may include digitized signatures that make them legally recognized (XFDL - Extensible Forms Description Language).
      • Exchange of job requests and resumes (HRMML - Human Resource Management Markup Language).
      • Formatting mathematical formulas and scientific information on the Web (MathML - Mathematical Markup Language).
      • Description of molecular structures (CML - Chemical Markup Language).
      • Encoding and displaying information about DNA, RNA, and protein sequences (BSML - Bioinformatic Sequence Markup Language).
      • Coding of genealogical data (GeDML - Genealogical Data Markup Language).
      • Astronomical data exchange (AML - Astronomical Markup Language).
      • Creation of musical scores (MusicML - Music Markup Language).
      • Working with voice scripts to deliver information over the phone. Voice scripts can be used, for example, to generate voice messages, stock statements and weather forecasts (VoxML).
      • Information processing and delivery courier services. Federal Express, for example, already uses XML for this purpose.
      • Presentation of advertising in the press in digital format (AdMarkup).
      • Filling out legal documents and electronic exchange of legal information (XCI - XML Court Interface).
      • Coding of weather forecasts (OMF - Weather Observation Markup Format).
      • Exchange of information on real estate transactions (RETS - Real Estate Transaction Standard).
      • Exchange of insurance information.
      • Exchange news and information using open Web standards (XMLNews).
      • Presentation of religious information and markup of liturgical texts (ThML - Theological Markup Language, LitML - Liturgical Markup Language).

      I think you already understand why HTML is needed (yes, HTML): it presents data in the browser. That is, there is HTML code, and a certain appearance corresponds to that code. However, modern trends require not only that data be displayed, but also that it have a sound internal structure.

      XML exists precisely to create that structure. A simple example:

      Green apple

      For us humans, everything is clear at once: the image of a green apple immediately comes to mind. But how do you explain to a computer that this is an apple and not an orange, a person, or our galaxy? Here again XML comes to the rescue: we can create any tags we like, making it clear where the apple is, where the orange is, where the person is, and where our galaxy is. I hope I have explained it clearly.
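      As a rough illustration of that point, the sketch below marks up a few things with invented tags and lets a Python program tell them apart using the standard xml.etree.ElementTree module. The tag and attribute names (`fruit`, `kind`, `person`, `galaxy`) are assumptions made up for this example, not part of any standard.

```python
import xml.etree.ElementTree as ET

# Invented tag names: the markup itself says what each piece of text is.
document = """
<things>
  <fruit kind="apple" color="green">Green apple</fruit>
  <fruit kind="orange" color="orange">Ripe orange</fruit>
  <person>Gardener</person>
  <galaxy>Milky Way</galaxy>
</things>
"""

root = ET.fromstring(document)

# A program can now pick out the apples without guessing from the words.
apples = [el.text for el in root.iter("fruit") if el.get("kind") == "apple"]
print(apples)
```

      The program never has to interpret the phrase "Green apple"; it relies only on the tags and attributes that surround it.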

      Now for the most important thing. The main feature of XML is its universality: any modern programming language can work with XML. And since an XML document is a text file, you can edit it in an ordinary text editor. Now to practice. Here is where XML is used:

      • Settings files. Settings stored in an XML file are very easy to read and write. For this reason, there are hundreds of XML settings files in use.
      • A data bridge between programs written in different languages. This very important feature follows from the universality of the language, and it is used regularly in complex systems.
      • Data storage. In effect, this is an analogue of a database that does not require a DBMS (for example, MySQL). And thanks to the XPath query language, it becomes easy to query this "database".
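      The settings-file and data-storage uses above can be sketched in a few lines of Python. The file contents and names here are hypothetical; note also that the standard xml.etree.ElementTree module supports only a limited subset of XPath, which is enough for this kind of lookup.

```python
import xml.etree.ElementTree as ET

# A hypothetical settings document.
settings_xml = """
<settings>
  <window width="800" height="600"/>
  <user name="alice" role="admin"/>
  <user name="bob" role="guest"/>
</settings>
"""

root = ET.fromstring(settings_xml)

# Read a setting stored as an attribute.
width = int(root.find("window").get("width"))

# XPath-style query: every <user> whose role attribute equals "admin".
admins = [user.get("name") for user in root.findall("./user[@role='admin']")]

print(width, admins)
```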

      And finally, I can give a very simple example from my own practice. My website has a sitemap in XML format that lists links to all pages of the site. It is a very convenient and important thing for good indexing of the site; however, adding every new page to it by hand is tedious. Knowing how to work with XML, I easily automated this task. So XML is a useful language that any programmer should know at least in general terms.
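      The author's own script is not shown, so the following is only a sketch of how such automation might look in Python. It assumes the sitemap follows the sitemaps.org format (a `urlset` of `url`/`loc` entries); the `add_page` function and the example.com addresses are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)  # serialize without a namespace prefix

def add_page(sitemap_text, page_url):
    """Return sitemap XML text with one more <url><loc> entry appended."""
    root = ET.fromstring(sitemap_text)
    url = ET.SubElement(root, f"{{{SITEMAP_NS}}}url")
    loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
    loc.text = page_url
    return ET.tostring(root, encoding="unicode")

existing = (
    f'<urlset xmlns="{SITEMAP_NS}">'
    "<url><loc>https://example.com/</loc></url>"
    "</urlset>"
)

updated = add_page(existing, "https://example.com/new-page")
print(updated)
```

      In a real setup the text would be read from and written back to the sitemap file, and the function would be called whenever a page is published.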

      Why is this XML needed?

      [Recently, with the appearance of these pages, the question I have been asked most often has turned out to be: "Tell me, why is XML needed at all? Isn't HTML enough for us?" Having little time (or wit ;) to prepare my own publications, and deeply respecting the classics, I preferred to quickly translate an excellent article on the subject. Perhaps this is the first installment of a "to help you" series.]

      Jon Bosak, Tim Bray
      XML and the Second-Generation Web
      from Scientific American, May 1999

      Give people a couple of hints, and they'll figure out the rest on their own. Looking at a page where large type is followed by blocks of smaller type, anyone quickly realizes that this is the beginning of an article. Looking at a grocery list, you can quickly guess that these are "instructions" for a trip to the store. When you see columns of numbers, you understand that this is a bank statement. Computers are not yet that smart: they have to be told exactly what they are dealing with and what is required of them.

      It is for this purpose, to make information self-describing, that a new document markup language was invented: Extensible Markup Language (XML). These simple-sounding changes (a "self-describing" document, a change in the rules for communicating with computers) carry enormous potential: the role of the Internet is beginning to expand from a medium for delivering information to other kinds of human activity. Indeed, since its approval by the W3C in 1998, the XML specification has spread like wildfire into industry and science, manufacturing and medicine.

      Enthusiasts hoped that XML would provide an opportunity to solve a number of global problems of the Web. These problems are known: firstly, the Internet, a super-fast network, often behaves worse than a turtle; and secondly, although almost all the information is available on the Internet, it is often maddeningly difficult to find something necessary there.

      Both of these problems are caused mainly by the nature of the main language of the Web, HTML. Although HTML is clearly the most successful electronic-publishing language ever proposed, it is superficial: it basically just tells the browser how to place text, images, and buttons on the page. HTML focuses on the presentation of information and is therefore fairly easy to learn, but this comes at a cost.

      The cost shows in how hard it is to build Web sites that are more than fancy fax machines sending out pages to anyone who asks. People and companies want Web sites that can take orders from customers, transmit medical records, and even run factory equipment and scientific instruments from half a world away. HTML was never designed for such tasks.

      For example, even if your doctor can pull up your test results from your medical record in his browser, he is unlikely to be able to send them over the network to a specialist and have the reply inserted straight back into his database. His computer doesn't know what to do with information that is as clear to it as

      <H1>blah blah</H1>

      or <BOLD>blah blah blah</BOLD>.
      As the legendary Brian Kernighan once noted, the trouble with the WYSIWYG principle ("what you see is what you get") is that what you see is all you've got.

      The words above that are enclosed in angle brackets are called tags. HTML has no tag for test results, which points to another of its drawbacks: inflexibility. Adding a new tag to the language involves so much bureaucratic red tape, and takes so long, that hardly anyone attempts it. Yet every application, not just the doctor's in the example above, needs tags of its own.

      This largely explains the slow pace of today's online stores, mail-order catalogs and other interactive sites. If you change the quantity or the shipping method of your order, then just to see the handful of digits that changed in the "total" field you have to ask a remote (and already overworked) server to send you a complete, newly generated page, graphics and all. Meanwhile your own powerful computer sits idle, because it has only been told about things like <H1> and <BOLD>, not about prices and delivery options.

      Add to this the poor quality of Web search capabilities. Since there is no way to specifically mark price information, it is absolutely impossible to search the web for pages based on “price.”


      Something old, something new

      In principle, the solution is simple: the tags need to indicate what kind of information it is, and not what it should look like. For example, mark up the components of an order for a shirt with the tags “price, size, quantity, color” rather than “bold, paragraph, row, column”, as suggested in HTML. Then it’s easier for the program to identify the document as an order and do the rest of the work: display this order in one form or another, put it through the accounting system, or make sure that the new shirt is delivered to your doorstep the next day.
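      A small sketch may help show what this buys a program. The order below is marked up with descriptive tags like those just mentioned, and a few lines of Python (standard library only) compute its total. The document layout, the `currency` attribute, the values, and the `order_total` function are all invented for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# A hypothetical shirt order marked up by meaning, not by appearance.
order_xml = """
<order>
  <item>
    <name>Shirt</name>
    <color>blue</color>
    <size>M</size>
    <quantity>2</quantity>
    <price currency="USD">19.99</price>
  </item>
</order>
"""

def order_total(xml_text):
    """Sum quantity * price over every <item> in the order."""
    root = ET.fromstring(xml_text)
    total = Decimal("0")
    for item in root.findall("item"):
        quantity = int(item.findtext("quantity"))
        price = Decimal(item.findtext("price"))
        total += quantity * price
    return total

total = order_total(order_xml)
print(total)
```

      Because the price and quantity are labeled as such, the same document could equally be displayed, passed to an accounting system, or searched by price.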

      We, the W3C working group, began developing such a project back in 1996. The idea was strong, although not entirely original. For generations, editors and printers have marked up manuscripts with notes for typesetters. Such "markup languages" developed independently until 1986, when, after ten years of work, the International Organization for Standardization (ISO) introduced a system for creating new markup languages.

      Named SGML (Standard Generalized Markup Language), this language for describing languages, a metalanguage, has proven its usefulness in many large publishing systems. Even HTML was defined using SGML. The only difficulty with SGML is that it is too general: it is full of clever features designed to minimize keystrokes, dating from a time when every byte counted. That is why today's Web browsers cannot cope with it.

      In creating XML, our working group stripped SGML of its husks and proposed a highly focused and digestible metalanguage. The basis of XML is a set of rules that anyone can follow to create their own markup language. These rules are chosen so that a single small program (called a parser, or syntactic analyzer) can cope with recognizing any new language. Let's look again at the example of the doctor who would like to send test results to a specialist. If medical professionals were to construct their own XML-based markup language for encoding physicians' notes (a number of groups have been working on the problem for a long time), then a message from a doctor to a colleague might contain something like


      <patient-name>blah blah</patient-name>
      <drug-allergy>blah blah blah</drug-allergy>

      With markup like this, it is no longer difficult to write a program for any computer that recognizes these standardized medical records and enters this literally vital information into its database.
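      As a rough sketch of such a program, the Python below parses a message of this kind and stores it in a database, using only the standard library. Everything specific here is an assumption made for the example: the tag names, the sample values, the table layout, and the `store_allergies` function do not come from any real medical markup standard.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A hypothetical message in an invented medical markup language.
message = """
<patient>
  <name>Jane Doe</name>
  <drug-allergy>penicillin</drug-allergy>
  <drug-allergy>aspirin</drug-allergy>
</patient>
"""

def store_allergies(connection, xml_text):
    """Insert one row per <drug-allergy> element found in the message."""
    patient = ET.fromstring(xml_text)
    name = patient.findtext("name")
    rows = [(name, allergy.text) for allergy in patient.findall("drug-allergy")]
    connection.executemany(
        "INSERT INTO allergies (patient, drug) VALUES (?, ?)", rows
    )
    return len(rows)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE allergies (patient TEXT, drug TEXT)")
stored = store_allergies(db, message)
print(stored, db.execute("SELECT patient, drug FROM allergies").fetchall())
```

      The parsing itself is done by a general-purpose XML parser; only the few lines that know about `name` and `drug-allergy` are specific to the medical vocabulary.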

      Just as HTML was designed to allow anyone to read Internet documents, XML gives us an Esperanto that anyone can read and write, despite a babel of incompatible platforms. Unlike most other data formats, XML markup also makes sense to an ordinary person, because it consists of nothing more than readable text.

      The power of XML's versatility comes from a minimal set of well-chosen rules. First, tags always come in pairs and, like parentheses, surround the text to which they apply. Second, like quotation marks, tag pairs can be nested inside one another, allowing you to build complex multi-level structures.

      The nesting rule automatically imposes a certain simplicity on every XML document, giving it a structure known in computer science as a tree. As in a family tree, every graphic or text element of a document is the parent, child, or sibling of some other element, and these relationships are always unambiguous. Of course, trees cannot describe every kind of data structure, but they cover most of the typical cases in computing. In addition, trees are extremely convenient for programmers. It is no trouble to write a small piece of code that reorders transactions or prints a clearly laid-out receipt when the receipt is represented as a tree.
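      To illustrate the receipt case, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The receipt layout, the items and amounts, and the `walk` helper are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical receipt represented as a tree.
receipt_xml = """
<receipt>
  <transaction><item>Tea</item><amount>4.50</amount></transaction>
  <transaction><item>Bread</item><amount>2.25</amount></transaction>
  <transaction><item>Milk</item><amount>1.10</amount></transaction>
</receipt>
"""

root = ET.fromstring(receipt_xml)

# Each <transaction> is a child of <receipt> and a sibling of the others.
# Sorting the children in place reorders the branches of the tree.
root[:] = sorted(root, key=lambda t: float(t.findtext("amount")))

def walk(element, depth=0):
    """Yield (depth, tag, text) for the element and all its descendants."""
    yield depth, element.tag, (element.text or "").strip()
    for child in element:
        yield from walk(child, depth + 1)

# Print the tree with indentation that reflects the nesting.
for depth, tag, text in walk(root):
    print("  " * depth + tag + (": " + text if text else ""))
```

      The recursive `walk` function works for a tree of any depth, which is what makes tree-shaped data so convenient to process.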

      The second source of XML's universal power is its reliance on the new Unicode standard, an encoding system that allows text in all the world's major languages to be intermingled. In HTML, by contrast, as in most word processors, a document can as a rule be in only one particular language, whether English, Japanese, or Arabic.
      And if a program does not know the encoding for a certain language, it cannot read the (HTML) document at all. It can be worse: for example, because of inconsistent encodings, programs written in Taiwan often cannot read texts intended for mainland China. With XML, a program that handles it correctly can deal with any combination of these character sets. Thus, XML not only allows data to be exchanged between different computer platforms, but also makes it possible to overcome national and cultural barriers.
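      A short sketch shows the practical effect. The Python below, using the standard xml.etree.ElementTree module, builds one document containing text in three scripts, serializes it to UTF-8 bytes, and parses it back. The element names and the greetings are invented for the example.

```python
import xml.etree.ElementTree as ET

# Build one document that mixes several scripts.
greetings = ET.Element("greetings")
for lang, text in [("en", "Hello"), ("ja", "こんにちは"), ("ar", "مرحبا")]:
    entry = ET.SubElement(greetings, "greeting", {"lang": lang})
    entry.text = text

# Serialize to UTF-8 bytes, as the document might travel over a network.
data = ET.tostring(greetings, encoding="utf-8")

# Parsing the bytes back recovers the text in every script.
parsed = ET.fromstring(data)
texts = {g.get("lang"): g.text for g in parsed}
print(texts)
```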


      End of the World Wide Wait

      With the rise of XML, the Web should become much more responsive. Today, computing devices on the network, whether powerful desktops or pocket organizers, can do little more than get a form, fill it out, and then swap it back and forth with the Web server until the job is done. XML lets the form carry the structure and semantics of the data, so all those devices can do much of the processing on the spot. This will not only reduce the load on servers, but should also significantly reduce network traffic.

      To illustrate, imagine using an online travel agency to find a flight from London to New York on the 4th of July. Most likely, you will get a list several times longer than can fit on the screen. You can shorten the list by specifying the departure time, price, or airline more precisely, but each time you simply send another request to the travel agency's server and have to wait for a response. However, if this long list of flights were delivered to you in XML, the agency could send along a small Java applet with which you could sort the list and filter out unwanted flights instantly, without any interaction with the server. Multiply this by millions of Web users, and the overall effect is impressive.
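      The article imagines a Java applet; the same client-side idea can be sketched in Python for brevity. The flight list, the attribute names, the airlines, and the `cheapest_before` function below are all invented for illustration. The point is only that once the list has arrived as XML, filtering and sorting need no further request to the server.

```python
import xml.etree.ElementTree as ET

# A hypothetical flight list as it might arrive from the agency.
flights_xml = """
<flights from="London" to="New York">
  <flight airline="AirOne" departs="08:15" price="540"/>
  <flight airline="SkyTwo" departs="11:40" price="420"/>
  <flight airline="AirOne" departs="17:05" price="610"/>
  <flight airline="JetTri" departs="21:30" price="390"/>
</flights>
"""

def cheapest_before(xml_text, latest_departure, max_price):
    """Filter flights by departure time and price, cheapest first."""
    root = ET.fromstring(xml_text)
    matches = [
        f for f in root.findall("flight")
        # Zero-padded HH:MM strings compare correctly as text.
        if f.get("departs") <= latest_departure and int(f.get("price")) <= max_price
    ]
    matches.sort(key=lambda f: int(f.get("price")))
    return [(f.get("airline"), f.get("departs"), int(f.get("price"))) for f in matches]

result = cheapest_before(flights_xml, "18:00", 600)
print(result)
```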

      The more online information is tagged with industry-specific XML tags, the easier it will be to find what you're looking for. Today, an Internet search for "stockbroker jobs" will bury you under an avalanche of advertisements, but probably only a few of them will be job listings: most listings are hidden in the classified-ad sections of newspaper sites, where search robots do not reach. The Newspaper Association of America is now creating its own XML-based ad markup language, which promises to make such searches much more efficient.

      Even that is just an intermediate step. Librarians have long known that the quickest way to find something is to look not at the documents themselves but at compact descriptions that point to the sources, such as the cards in a library catalog. Such information about information is called "metadata".

      Therefore, from the very beginning, an important part of the XML project has been the creation of an accompanying metadata standard. The Resource Description Framework (RDF), completed in February, should do for information on the Web what index cards did for library books. As RDF metadata spreads across the Web, it will make searching much faster and more relevant than it is today. There are no librarians on the Web, but every webmaster wants his site to be easy to find, so we expect that RDF, once people discover its power, will have a huge impact on the Internet.

      Of course, information can be obtained without searching. After all, the Web is hypertext - billions of pages riddled with hyperlinks - those underlined words that you just have to click on to be whisked away to some other page. In XML, the hyperlink mechanism is also greatly enhanced. The XML linking specification, called XLink, which the W3C is preparing by the end of the year, will allow the user to choose from multiple destinations. Another type of hyperlink will allow you to receive text or an image directly at the point of clicking, allowing the visitor not to leave the page.

      Perhaps the most useful part of XLink will be the part of the specification that lets authors use indirect links, which point not to the pages themselves but to entries in some central database. Then, if the author changes the address of a page, editing a single entry in that database is enough to update every link leading to the page. This should help get rid of the all-too-common "404 File Not Found" messages that signal a broken link.

      The combination of more efficient processing, more precise searches and more flexible linking will revolutionize the structure of the Web and open up entirely new ways of accessing information. For users, this new Web will be significantly faster, more powerful and more useful than today's.


      Cooperation needed

      Of course, not everything is so simple. XML allows anyone to design a new language in their own way, but creating a good language is a task whose difficulty should not be underestimated. Coming up with a language is just the beginning: it's naive to expect the meanings of your tags to be obvious to other people until you provide a manual for the language, and to be clear to computers until you write programs that work with the language's tags.

      It is not difficult to explain why this is so. If tags alone were enough to teach a computer to process orders, XML would not be needed. There would not even be a need for programmers, since computers would be smart enough to do everything on their own.

      What XML offers is not magic, but efficiency. It establishes ground rules that take care of one layer of programming detail, so that people with similar interests can concentrate on the other hard nut to crack: agreeing on exactly how they want to represent the data they exchange. This is a very difficult problem, although not a new one.

      And such agreements will be reached, since the growing incompatibility of computer platforms causes delays, financial losses, and confusion in almost every field of activity. People want to exchange ideas and get things done regardless of the fact that everyone has different computers, and for this to become a reality, the joint development of specialized languages for different fields still has a long way to go. However, the flurry of new acronyms ending in "ML" demonstrates the undeniably innovative spirit that XML has brought to science, business and education.

      When creating a new XML markup language, its creators must agree on three things: which tags it will have, how they can nest within each other, and how they should be processed. The first two points, the language's vocabulary and structure, are currently encoded in a DTD (Document Type Definition). The XML standard does not oblige language developers to use DTDs, but most new languages will apparently have one, because it makes it easier for programmers to write programs that understand the markup and extract something meaningful from it. We will also need manuals that describe the meaning of every tag in human language. For example, HTML has a DTD, but there are also hundreds of pages of familiar HTML manuals that programmers consult when developing browsers and other programs for the Web.


      Essay on style

      For users, the main thing is what the program can do, and not what is written in its description. In general, people prefer that programs allow them to see XML-encoded information in a readable form. But in the XML tags itself there is no special markup indicating. how data should be presented on a screen or printed sheet.

      For publishers seeking to "write once and publish everywhere," the most important thing is to produce a publication once and then "pour" it into a myriad of formats, both print and electronic. XML helps them do this: content is marked with descriptive tags that are independent of the rendering environment. The publisher can then formalize the presentation rules in so-called stylesheets, which automatically style the work for different devices and environments. The standard XML language developed for this purpose is called Extensible Stylesheet Language (XSL).

      The latest browser versions can read XML documents, select the appropriate stylesheets, and use them to sort and format information on the screen. The reader may not even realize that he is dealing with XML rather than HTML, unless he notices that XML-based sites are faster and easier to use.

      Visually impaired people also benefit from this approach to publishing, since XSL makes it possible to render XML in Braille or as speech. These advantages apply to others as well: for example, a traveling salesman who wants to surf the Internet from his car would probably find it quite convenient to have pages read aloud.

      Although the core of the Web at first consisted of scientific and educational programs, today it is commerce (or, one might say, commercial expectations) that fuels its rapid growth. Everyone remembers the recent stir caused by the surge in online sales, but businesses are moving their dealings with each other online just as quickly. The flow of goods from large manufacturers is begging for automation over the network. Yet today's business schemes rely on complex program-to-program interactions, and in practice this works very poorly, because success requires a uniformity of processing that is still far from being achieved.

      For centuries, people have successfully done business by exchanging standard documents: orders, invoices, declarations, receipts and so on. The documents worked for the business, and no one required one party to know the inner workings of the other. Each document showed the recipient exactly as much as needed to be shown, and no more. Exchanging documents is probably the right way to do business on the Web too. But this was not at all the task for which HTML was created.

      XML, by contrast, is designed specifically for exchanging documents, and it seems clear that e-commerce will rest on conventions expressed in millions of XML documents traveling around the Internet.

      Thus, for its users the XML-enhanced Web should become a faster, friendlier and better place to do business. It asks even more of webmasters and web designers: armies of programmers will need a thorough knowledge of the new XML languages. And although the days of self-taught hackers [the authors mean the best sense of this word] are not over yet, their population is already under threat.

      Tomorrow's web designer must be proficient not only in producing text and graphics, but also in building multi-layered, interdependent systems based on DTDs, data trees, hyperlink structures, metadata and style components - a strong and advanced infrastructure of the second generation Web.

      Let's consider the technology of using XML to transfer data to the server.

      We have already looked at 2 ways to transfer data to the server: plain text with a delimiter and JSON. But they have disadvantages:

      • Few data types. JSON has only strings, numbers, booleans and null (plus the arrays and objects built from them), that is, a limited set of types.
      • It is difficult to check the integrity of the transmitted data.
      • Data is difficult to visualize: complex objects are hard to display, for example, as HTML.
      • Data is difficult to transform, that is, it is hard to map the properties of one object onto the properties of another.

      Now let's turn to XML as a method of data transfer. XML (eXtensible Markup Language) is a markup language designed to describe, store and transmit structured data. Today XML is used everywhere.

      There are many technologies based on XML: DOM (programmatic interaction with data), XLink (pointers and links), XPath (description and selection of elements), XSL, XSLT (XML document transformation).

      Parsing the XML package looks like this:

      // XMLHttpRequest object
      var req = getXmlHttpRequest();
      // Install the handler
      req.onreadystatechange = function () {
          if (req.readyState == 4) { // state 4: "complete"
              var xml = req.responseXML; // already a parsed DOM document
          }
      };

      Here you don't even need to serialize or deserialize anything; the object does it for you. As soon as the server sends XML data, it is available in parsed form (responseXML is a DOM document). Read more about DOM technology in previous articles on the site.

      Sometimes, for debugging, you need to serialize and deserialize XML data (to transfer data to the server, this is done automatically, you do not need to do it manually). Let's serialize into a string:

      // for IE
      var str = dom.xml;
      // for Firefox
      var serializer = new XMLSerializer();
      var str = serializer.serializeToString(dom);

      For IE the working code is shorter, because the DOM document there already has a built-in xml property for serialization, while other browsers provide a separate XMLSerializer object for this purpose.

      When working with XML data, we usually deal with the DOM of the document. It is therefore worth recalling some aspects of the DOM (read about this in previous articles). Let me just remind you how to access the elements of a DOM document:

      // root element
      var root = xmlDOM.documentElement;
      // first element in the collection
      var book = root.childNodes[0];
      // child element
      var title = book.childNodes[0];
      // text node of the element
      alert(title.firstChild.nodeValue);

      You can also select all elements of the same type from the document's DOM. Attention! getElementById is of little use here: in XML an attribute named id can mean anything, and it is not treated as an identifier unless a DTD declares it as one, so this function is not used.

      // select all elements with the same tag
      var books = xmlDOM.getElementsByTagName("book");

      XML is actively used not only for data representation but also for data exchange in a service-oriented architecture. This is an approach in which a complex application is presented not as a classic client-server application but as a set of services, each responsible for its own tasks. Each service has entry points (interaction points). There is no clearly defined client here, because one service can be a client of another service. The result is a distributed technology. There are several approaches to building such systems, for example remote procedure calls and SOAP.

      For services on different machines to interact, they must speak the same language (regardless of which operating system each service runs on). Such a language was developed and called RPC.

      XML-RPC protocol

      RPC (Remote Procedure Call) is a protocol for interaction between two remote points. It allows point "a" to call a function on remote point "b".

      There are several implementations of the RPC protocol. Let's look at an XML-based implementation.

      Essentially, the client and server simply exchange some XML fragments.

      XML-RPC provides the following data types:

      • boolean.
      • integer.
      • double.
      • string.
      • date/time.
      • base64.
      • array.
      • struct.
      • null.

      That is, when passing a value you must declare its data type. The struct type is similar to a JSON object.
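
      As a rough illustration of what such an XML fragment looks like, the sketch below builds an XML-RPC request as a string, wrapping every value in an element that names its type. The function names and the method name library.findBook are hypothetical; the nil element used for null is an extension rather than part of the core XML-RPC specification, and date/time and base64 values are left out for brevity.

```javascript
// Escape the characters that are special inside XML text.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Encode one JavaScript value as an XML-RPC <value> element with an explicit type.
function encodeValue(v) {
  if (v === null) return "<value><nil/></value>"; // extension, not core XML-RPC
  if (typeof v === "boolean") return "<value><boolean>" + (v ? 1 : 0) + "</boolean></value>";
  if (typeof v === "number") {
    var tag = Number.isInteger(v) ? "int" : "double";
    return "<value><" + tag + ">" + v + "</" + tag + "></value>";
  }
  if (typeof v === "string") return "<value><string>" + escapeXml(v) + "</string></value>";
  if (Array.isArray(v)) {
    return "<value><array><data>" + v.map(encodeValue).join("") + "</data></array></value>";
  }
  // Any other object becomes a struct: a list of named members.
  var members = Object.keys(v).map(function (key) {
    return "<member><name>" + escapeXml(key) + "</name>" + encodeValue(v[key]) + "</member>";
  });
  return "<value><struct>" + members.join("") + "</struct></value>";
}

// Build the whole request: a method name plus a list of typed parameters.
function buildMethodCall(methodName, params) {
  var body = params.map(function (p) {
    return "<param>" + encodeValue(p) + "</param>";
  }).join("");
  return '<?xml version="1.0"?><methodCall><methodName>' + escapeXml(methodName) +
    "</methodName><params>" + body + "</params></methodCall>";
}

// Hypothetical call: look up a book by title and maximum price.
console.log(buildMethodCall("library.findBook", ["XML & you", 25]));
```

      The resulting string would be sent as the body of an HTTP POST request; the server answers with a similar fragment whose root element is methodResponse.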

      Converting XML Data

      To transform data received from the server in the form of XML, XSLT is used.

      XSLT (Extensible Stylesheet Language Transformations) is a technology that takes XML as input and produces whatever you want as output.

      XSLT transformation in JavaScript for IE

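      var dom = new ActiveXObject("MSXML2.DOMDocument");
      dom.async = false;
      // the XML source must be loaded into dom here (the call is missing in the original)
      // the ProgID below is garbled in the original; MSXML2.DOMDocument is assumed
      var xsl = new ActiveXObject("MSXML2.DOMDocument");
      xsl.async = false;
      xsl.load("my.xsl");
      // the transformation itself
      var result = dom.transformNode(xsl);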

      XSLT transformation in JavaScript for Firefox, Chrome, Opera

      var xslStylesheet;
      var xsltProcessor = new XSLTProcessor();
      // get the stylesheet
      var myXMLHTTPRequest = new XMLHttpRequest();
      myXMLHTTPRequest.open("GET", "example.xsl", false);
      myXMLHTTPRequest.send(null);
      xslStylesheet = myXMLHTTPRequest.responseXML;
      xsltProcessor.importStylesheet(xslStylesheet);
      // get xml
      myXMLHTTPRequest = new XMLHttpRequest();
      myXMLHTTPRequest.open("GET", "example.xml", false);
      myXMLHTTPRequest.send(null);
      var xmlSource = myXMLHTTPRequest.responseXML;
      // the transformation itself
      var resultDocument = xsltProcessor.transformToDocument(xmlSource);

      Introduction to Proper Markup

      XML stands for Extensible Markup Language, with the emphasis on markup. You can create text and mark it up with enclosing tags, turning every word, sentence or fragment into identifiable, sortable information. The files you create, or document instances, consist of elements (tags) and text, and the elements help the document to be understood correctly when read on paper or processed electronically. The more descriptive the elements, the more parts of the document can be identified. One advantage of markup since its early days is that, even if the computer system is lost, the printed data remains readable thanks to the tags.

      Markup languages have evolved from the first forms created by companies and government agencies to Standard Generalized Markup Language (SGML), Hypertext Markup Language (HTML), and ultimately to XML. SGML may seem complex, and HTML (which was essentially just a collection of elements at first) has proven to be not powerful enough to identify information. XML was designed to be an easy-to-use and easy-to-extend markup language.

      In XML, you can create your own elements, which lets you represent pieces of data precisely. Documents can be divided not only into paragraphs and headings; any fragment within the document can be marked up as well. For this to be effective, you need to settle on a definitive list of your elements and stick to it. Elements can be defined in a Document Type Definition (DTD) or in a schema, as discussed briefly below. Once you have mastered XML and started using it, don't be afraid to experiment with element names as you create actual files.

      Building an XML Document

      As mentioned, XML files consist of text and markup. Most text is placed in elements, where the text is surrounded by tags. For example, let's say you want to create a cookbook in XML format. We have a recipe called Ice Cream Sundae, which needs to be converted to XML. To mark up the name of the recipe, we enclose its text in an element that begins and ends with tags. This element can be called recipename. To write the start tag of an element, place its name in angle brackets (<>), like this: <recipename>. Then enter the text Ice Cream Sundae. After the text comes the closing tag, which is the element name in angle brackets with a forward slash (/) before the name, like this: </recipename>. These tags form an element, into which you can enter text and even other elements.

      Element names can be created for individual documents or for groups of documents. According to your requirements, you can specify the rules that must be followed for the elements. Elements can be strictly specific or quite general. The rules must also define what is acceptable to include in each element. They can be strict, loose or in between. Simply create elements that define the parts of your document that you think are important.

      Start creating the XML file

      The first line of an XML document can be an XML declaration. This optional part of the file identifies it as an XML file, which can help automated tools and humans recognize the file as XML rather than SGML or other markup.

      The declaration normally states the XML version, <?xml version="1.0"?>, and can also name the character encoding, for example <?xml version="1.0" encoding="UTF-8"?> for Unicode. Because this declaration must be at the very beginning of the file, it is best to omit this optional element if you plan to combine small XML files into a larger one.
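
      To make the "very beginning of the file" rule concrete, here is a small sketch that reads the version and encoding from a declaration, but only when the declaration is the first thing in the text. The function name readXmlDeclaration is hypothetical, and the regular expressions are a simplification rather than a full implementation of the XML grammar.

```javascript
// Returns { version, encoding } if the text starts with an XML declaration, else null.
function readXmlDeclaration(source) {
  // The declaration counts only if it is the very first thing in the file.
  var declaration = /^<\?xml\s[^?]*\?>/.exec(source);
  if (!declaration) return null;
  var version = /\sversion\s*=\s*["']([^"']+)["']/.exec(declaration[0]);
  var encoding = /\sencoding\s*=\s*["']([^"']+)["']/.exec(declaration[0]);
  return {
    version: version ? version[1] : null,
    encoding: encoding ? encoding[1] : null
  };
}

var info = readXmlDeclaration('<?xml version="1.0" encoding="UTF-8"?><recipe></recipe>');
console.log(info.version, info.encoding);

// A declaration that is not at the very start (here after a line break) is not recognized.
console.log(readXmlDeclaration('\n<?xml version="1.0"?><recipe></recipe>'));
```

      This is also why concatenating several files that each carry a declaration causes trouble: every declaration after the first ends up in the middle of the combined file.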

      Creating a root element

      The start and end tags of the root element surround the entire text of the XML document. There should be only one root element in the file, and it is the required "cover" for the document. Listing 1 shows a snippet of the example used here, with a root element (recipe). (The full XML file is given elsewhere in the article.)

      Listing 1. Root element
      <recipe>
      </recipe>

      As you create your document, you will place text and additional tags between <recipe> and </recipe>.

      Names of elements

      Case sensitivity in tags

      When creating XML, the case of the start and end tags must match. Otherwise, you may receive an error message when using or viewing the XML. For example, Internet Explorer does not display text if there is a case mismatch. Instead, it displays messages about a mismatch between the start and end tags.

      So we have a root element, <recipe>. In XML, you first choose the element names and then, based on those names, define the corresponding DTD or schema. Names can contain letters, numbers, and special characters such as the underscore (_). Here are a few rules about names to remember:

      • Spaces are not allowed in element names.
      • Names must begin with a letter (or an underscore), not a digit or another sign. (After this first character, you can use any combination of letters, numbers, and valid symbols.)
      • You may use any letter case, but XML is case-sensitive, so use case consistently to avoid errors and confusion.
      Listing 2. Other elements
      Ice Cream Sundae 5 minutes
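
      The naming rules listed above can be checked mechanically. The sketch below is a simplified check that accepts ASCII names only; the XML specification additionally allows many Unicode letters, an underscore or a colon as the first character, and reserves names beginning with "xml". The function name isValidElementName is hypothetical.

```javascript
// Simplified rule: first a letter or underscore, then letters, digits, "_", "." or "-".
var SIMPLE_NAME = /^[A-Za-z_][A-Za-z0-9_.-]*$/;

function isValidElementName(name) {
  return SIMPLE_NAME.test(name);
}

console.log(isValidElementName("recipename"));  // true
console.log(isValidElementName("recipe name")); // false: contains a space
console.log(isValidElementName("2ndcourse"));   // false: starts with a digit
```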

      An XML document can contain empty elements, which have nothing inside them and can be written as a single tag rather than a pair of start and end tags. For example, this could be a standalone HTML-style tag such as <br>. It does not contain any child elements or text, so it is an empty element and can be written as <br /> (with a space and the familiar trailing slash at the end).
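
      The difference between a regular element and an empty one can be sketched in a few lines. The helper renderElement and the element name pagebreak are hypothetical, and the sketch assumes the text contains no special characters such as < or &.

```javascript
// Render an element as a start/end tag pair, or as a single tag when it has no content.
function renderElement(name, text) {
  if (text === undefined || text === null || text === "") {
    return "<" + name + " />"; // empty element: one tag with a trailing slash
  }
  return "<" + name + ">" + text + "</" + name + ">";
}

console.log(renderElement("recipename", "Ice Cream Sundae"));
console.log(renderElement("pagebreak"));
```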

      Nesting elements

      Nesting is the placement of elements inside other elements. The new elements are called child elements, and the elements that surround them are their parent elements. Several elements are nested in the root element <recipe>, among them <recipename>. One of these children in turn contains several identical child elements of its own. Nesting can make an XML document multi-level.

      A typical syntax error involves the nesting of parent and child elements. Each child element must lie entirely between the opening and closing tags of its parent element. A child element must end before the next child begins.

      An example of correct nesting is given in Listing 3. Tags start and end without interleaving with other tags.

      Listing 3. Correct nesting of XML elements.
      Ice Cream Sundae 3 chocolate syrup or chocolate fudge 1 nuts 1 cherry 5 minutes
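
      One way to see the nesting rule in action is a small checker that keeps a stack of open elements: every end tag must match the most recently opened start tag, including its case. This is only a sketch, not a real XML parser: it ignores comments, CDATA sections and attribute values containing ">", and the element names ingredients and item in the example are hypothetical.

```javascript
// Returns "ok" or a short description of the first nesting problem found.
function checkNesting(xml) {
  var tagPattern = /<(\/?)([A-Za-z_][A-Za-z0-9_.-]*)[^<>]*?(\/?)>/g;
  var open = []; // names of the elements that are currently open
  var match;
  while ((match = tagPattern.exec(xml)) !== null) {
    var isEndTag = match[1] === "/";
    var name = match[2];
    var isEmptyElement = match[3] === "/";
    if (isEndTag) {
      if (open.length === 0) return "unexpected </" + name + ">";
      var expected = open.pop();
      if (expected !== name) {
        return "expected </" + expected + "> but found </" + name + ">";
      }
    } else if (!isEmptyElement) {
      open.push(name);
    }
  }
  return open.length === 0 ? "ok" : "missing </" + open[open.length - 1] + ">";
}

console.log(checkNesting(
  "<recipe><recipename>Ice Cream Sundae</recipename>" +
  "<ingredients><item>nuts</item><item>cherry</item></ingredients></recipe>"
));
console.log(checkNesting("<recipe><recipename>Ice Cream Sundae</recipe></recipename>"));
```

      Because names are compared exactly, the same check also catches a case mismatch between a start tag and its end tag.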

      Adding attributes

      Attributes are sometimes added to elements. An attribute consists of a name-value pair, where the value is enclosed in double quotes ("), like this: type="dessert". Attributes let you store additional parameters along with an element, and the values of these parameters can change from element to element within the same document.

      An attribute, or even several attributes, is specified within the element's start tag: <recipe type="dessert">. When adding multiple attributes, separate them with spaces: <element name1="value1" name2="value2">. Listing 4 shows the XML file as it looks now.

      Listing 4. Our XML file with elements and attributes
      Ice Cream Sundae 5 minutes

      Any number of attributes can be used. Consider what details you can add to your document. Attributes are especially useful if the documents will be sorted, for example by recipe type. Attribute names can contain the same characters as element names, with the same rules: no spaces, and the name must start with a letter.
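
      The following sketch shows how a start tag with attributes is assembled: name-value pairs separated by spaces, each value in double quotes. The helper names are hypothetical, as are the attributes cuisine and servings; only type="dessert" comes from the text above.

```javascript
// Escape the characters that must not appear literally inside a double-quoted attribute value.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Build a start tag from an element name and an object of attribute name-value pairs.
function startTag(name, attributes) {
  var parts = Object.keys(attributes || {}).map(function (key) {
    return key + '="' + escapeAttribute(attributes[key]) + '"';
  });
  return "<" + [name].concat(parts).join(" ") + ">";
}

console.log(startTag("recipe", { type: "dessert" }));
console.log(startTag("recipename", { cuisine: "american", servings: 1 }));
```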

      Correctly and incorrectly constructed XML

      If you follow the structural rules described above, you can easily create well-formed XML. Well-formed XML is XML written in compliance with all the XML rules: correct naming of elements, nesting, naming of attributes, and so on.

      Depending on what exactly you do with your XML, well-formed XML may be all you need. But consider the example above of sorting by recipe type. For that to work, every <recipe> element must contain the type attribute. It is very important to be able to check the code and ensure that this attribute is always present.

      Validation means checking the structure of a document against the rules established for it, including which child elements each parent element may contain. These rules are defined in a Document Type Definition (DTD) or in a schema. This kind of validation requires you to create a DTD or schema and then reference the DTD or schema file in your XML files.

      To enable validation, you need to place a document type declaration (DOCTYPE) near the beginning of your XML documents. This line contains a reference to the DTD or schema (the list of elements and rules) that will be used to validate the document. The DOCTYPE line could look something like the one in Listing 5.

      Listing 5. DOCTYPE
      <!DOCTYPE recipe SYSTEM "filename.dtd">

      This example means that the file listing your elements, named filename.dtd, is located on your own system (the SYSTEM keyword marks a local or private DTD, as opposed to PUBLIC, which marks a publicly registered one).

      Using Entities

      Entities may be text fragments or special characters. They can be specified inside the document or outside it. To avoid errors and to display correctly, entities must be properly declared and expressed.

      Some special characters cannot be entered directly into text. To use them, you need to write them as entities, using the codes of these characters. You can also define phrases, such as a company name, as entities and then use them throughout your text. To create an entity, give it a name and insert that name into the text after an ampersand (&) and followed by a semicolon, for example &entityname; (or whatever name you chose). Then put the declaration in your DOCTYPE line in square brackets ([]), as in Listing 6. This code specifies the text that is substituted for the entity.

      Listing 6. Entity

      Using entities helps you avoid repeating the same phrase or piece of information over and over. It also makes it easy to edit text in many places at once (for example, if a company changes its name) simply by changing the single line that defines the entity.
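
      The substitution an XML parser performs for entities can be imitated with a simple lookup table. In this sketch the function expandEntities, the entity name company and its replacement text are hypothetical; the five entries in PREDEFINED are the entities that XML itself predefines for special characters.

```javascript
// The five entities predefined by XML for special characters.
var PREDEFINED = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };

// Replace every &name; reference with its text, looking first at the declared entities.
function expandEntities(text, declared) {
  declared = declared || {};
  return text.replace(/&([A-Za-z_][A-Za-z0-9_.-]*);/g, function (whole, name) {
    if (Object.prototype.hasOwnProperty.call(declared, name)) return declared[name];
    if (Object.prototype.hasOwnProperty.call(PREDEFINED, name)) return PREDEFINED[name];
    return whole; // unknown entity: left as is here; a real parser would report an error
  });
}

// One definition, used wherever the entity appears in the text.
var entities = { company: "Example Foods" };
console.log(expandEntities("Recipes by &company;: salt &amp; pepper", entities));
```

      If the company changes its name, only the single entry in entities has to change, which mirrors editing the one entity definition in the DOCTYPE.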

      How to avoid mistakes

      As you learn to create XML files, open them in an XML editor to ensure they are formally correct and to ensure that XML rules are followed. For example, if you have Windows® Internet Explorer®, you can simply open your XML file in the browser. If your elements, attributes, and text are displayed, then the XML file is composed correctly. If there are errors, you probably messed up something in the syntax, and you need to carefully check your document for typos or missing tags and punctuation.

      Conclusion

      Having learned a few simple rules, you have the flexibility to develop your own XML elements and their attributes. XML rules are not complicated. Typing an XML document is also easy. The key is to understand what you want from your documents in terms of sorting and searching capabilities, and then design elements and attributes to meet those requirements.

      When you understand your purpose well and know how to mark up your text, you can create effective elements and attributes. From this perspective, careful markup is all that is needed to create a well-formed and usable XML document.