XML (eXtensible Markup Language)


XML is designed to facilitate the open exchange of information. It defines a syntax that you can use to create your own markup language, and it provides access to complex data sets across multiple platforms. XML is used to create custom tags that describe the content of data rather than its format or presentation. An XML file is made up of elements, each of which consists of a start tag (<title>), an end tag (</title>), and the information, or content, between the tags.

[Figure: basic XML structure]

For example, an XML element might be tagged as <name>, <enzyme>, or <sequence>.

Separating data from presentation enables the integration of data from diverse sources. Essentially, XML focuses on providing information about data and how it relates to other data. HTML, on the other hand, is concerned with the format of data. The following demonstrates the difference between HTML and XML:

 

Comparison of HTML and XML tagging

As the example below demonstrates, XML and HTML differ in how tags are used.

<h2 align="center">My Second-Level Heading, Centered</h2>
<ul>
    <li type="square">My first list item, with square bullet</li>
    <li type="square">My second list item</li>
</ul>

 

The example above shows how HTML tags describe content and layout. <h2>, <ul>, and <li> are elements, sometimes called tags. In the HTML tagging above, align is an attribute of the <h2> tag. XML uses elements and attributes in a similar way. Below is an example of XML tagging that demonstrates how XML uses tagging differently to organize information.


<Metabolic_reaction>
    <name>arginase catalysation</name>
    <compartment>cytosol</compartment>
    <reversibility>false</reversibility>
    <reactant>
        <name>L-arginine</name>
        <stoichiometry>1</stoichiometry>
    </reactant>
    <product>
        <name>L-ornithine</name>
        <stoichiometry>1</stoichiometry>
    </product>
    <enzyme>
        <name>arginase</name>
    </enzyme>
    <kinetics>
        <function>M-M</function>
    </kinetics>
</Metabolic_reaction>



Rather than describing the order and fashion in which the data should be displayed, the XML tags indicate what each item of data means. This data can then be decoded and used in specific and customizable ways.

 

Document Type Definition (DTD)
XML lets each group that uses it define elements tailor-made for its own needs. One way to do this is through a Document Type Definition (DTD). A DTD is essentially a tag-set definition: it lists the tags and attributes that are allowed and the conventions for nesting them. An XML-based language is commonly defined by a DTD, so it is the DTD that keeps information consistently structured and therefore shareable. A DTD specifies which elements may appear in a document and the structural relationships between them. A validating XML parser reads XML data from an input source and checks it against the rules defined in the DTD to make sure the data is structured correctly.
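As an illustration, a DTD for the Metabolic_reaction example shown earlier might look like the sketch below. The content models (which children are required, optional, or repeatable) are assumptions made for this example and are not taken from a published DTD. Here the DTD is embedded in the document itself as an internal subset; the same declarations could also live in a separate file.

```xml
<?xml version="1.0"?>
<!-- Hypothetical DTD: the occurrence rules (+, *, ?) are illustrative assumptions -->
<!DOCTYPE Metabolic_reaction [
  <!-- A reaction has a fixed sequence of children; at least one reactant and one product -->
  <!ELEMENT Metabolic_reaction (name, compartment, reversibility,
                                reactant+, product+, enzyme*, kinetics?)>
  <!-- Leaf elements hold plain character data -->
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT compartment (#PCDATA)>
  <!ELEMENT reversibility (#PCDATA)>
  <!ELEMENT stoichiometry (#PCDATA)>
  <!ELEMENT function (#PCDATA)>
  <!-- Nesting rules for the structured children -->
  <!ELEMENT reactant (name, stoichiometry)>
  <!ELEMENT product (name, stoichiometry)>
  <!ELEMENT enzyme (name)>
  <!ELEMENT kinetics (function)>
]>
<Metabolic_reaction>
    <name>arginase catalysation</name>
    <compartment>cytosol</compartment>
    <reversibility>false</reversibility>
    <reactant>
        <name>L-arginine</name>
        <stoichiometry>1</stoichiometry>
    </reactant>
    <product>
        <name>L-ornithine</name>
        <stoichiometry>1</stoichiometry>
    </product>
    <enzyme>
        <name>arginase</name>
    </enzyme>
    <kinetics>
        <function>M-M</function>
    </kinetics>
</Metabolic_reaction>
```

With these rules, a validating parser would reject a document in which, for instance, <product> appears before <reactant> or a <reactant> has no <stoichiometry>. Note that the DTD cannot express that <stoichiometry> must be a number; to the DTD it is just text.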

XML Schema

XML Schema is an information modeling language for XML developed by the W3C. It incorporates ideas from SOX and XDR, and provides full support for datatypes:
- Built-in types (integer, boolean, etc.)
- Custom types (telephone numbers, etc.)
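
The two kinds of datatypes listed above can be sketched in a small schema. The element names reuse those from the Metabolic_reaction example; the TelephoneNumber type and its ddd-ddd-dddd pattern are hypothetical and chosen only to show how a custom type is derived from a built-in one.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Built-in types: the content must be a valid integer or boolean -->
  <xs:element name="stoichiometry" type="xs:integer"/>
  <xs:element name="reversibility" type="xs:boolean"/>

  <!-- Custom type (hypothetical): a string restricted to the form 555-123-4567 -->
  <xs:simpleType name="TelephoneNumber">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{3}-[0-9]{3}-[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="telephone" type="TelephoneNumber"/>

</xs:schema>
```

Under this sketch, <reversibility>false</reversibility> is valid, while <reversibility>maybe</reversibility> or <telephone>12345</telephone> would fail validation.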

DTD and XML Schema differ in both data typing and syntax.

[Figure: comparing element declarations]
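
One way to see the difference is to declare the same element both ways. The sketch below declares the <reactant> element from the Metabolic_reaction example; the DTD form is shown inside a comment for comparison, and the choice of xs:string and xs:positiveInteger is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- DTD declaration of the same structure (non-XML syntax, untyped text):
         <!ELEMENT reactant (name, stoichiometry)>
         <!ELEMENT name (#PCDATA)>
         <!ELEMENT stoichiometry (#PCDATA)>
  -->

  <!-- XML Schema declaration: written in XML itself, with typed content -->
  <xs:element name="reactant">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="stoichiometry" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The DTD version is shorter but uses its own declaration syntax and treats every leaf value as character data. The schema version is more verbose, is itself an XML document, and lets a validator check that <stoichiometry> holds a positive whole number.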


(C) Ming Chen