XML Document Rule, XML Structuring, XML Presentation Technologies
XML document rule, XML structuring, XML presentation technologies, XML Transformation,
XSLT, XQUERY, XLINK, XPATH
SRM/MCA/HS 23
o To avoid this error, replace the "<" character with an entity reference:
<message>if salary &lt; 1000 then</message>
An XML element, in turn, consists of a start tag and an end tag, except in the case of elements
that are defined to be empty, which consist only of one tag. A start tag (also called an opening
tag) starts with < and ends with >. End tags (also called closing tags) begin with </ and end with
>. The XML specification is very specific about tag names; you can start a tag name with a letter,
an underscore, or a colon. The next characters may be letters, digits, underscores, hyphens,
periods, and colons (but no whitespace).
Valid tag names    Invalid tag names
<DOCUMENT>         <2003DOCUMENT>
<document>         <.document>
<_Record>          <Record Number>
<customer>         <customer*name>
<PRODUCT>          <PRODUCT(ID)>
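The naming rule above can be approximated with a regular expression. The following Python sketch is only an illustration: it assumes ASCII names (the XML 1.0 specification also admits many non-ASCII letters), and the function name is_valid_tag_name is hypothetical.

```python
import re

# Simplified, ASCII-only form of the naming rule described above:
# first character is a letter, underscore, or colon; the rest may also
# be digits, hyphens, and periods. No whitespace is allowed.
TAG_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:-]*")


def is_valid_tag_name(name: str) -> bool:
    """Return True if `name` satisfies the simplified rule."""
    return TAG_NAME.fullmatch(name) is not None


for name in ["DOCUMENT", "_Record", "2003DOCUMENT", "Record Number", "customer*name"]:
    print(name, is_valid_tag_name(name))
```

Names that start with a digit or a period, or that contain whitespace or punctuation such as * and (, are rejected by this check.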
Exercise 3:
Exercise 4:
The fundamental unit of XML content is the element, which is an author-specified chunk of
information. An element consists of an element name and element content. Consider the
annotated version of our business card document below to see examples of these content
types.
<ContactMethods> element content
<Phone>650-555-5000</Phone> data content
<Phone>650-555-5001</Phone> data content
</ContactMethods>
</BusinessCard>
In the above example, "BusinessCard" is the top-level element. In XML, there can be only one
element at the top level. This element is called the document element or sometimes root element.
Think of this element as the trunk of the tree from which all other elements branch. The
following figure shows the corresponding tree for the above example with each node
representing an element and identified with the element name. Conceptually the element content
resides within the node.
Empty elements:
Empty elements have only one tag instead of a start tag and an end tag. An empty element is closed with />.
Root element
Each well-formed XML document must contain one element that contains all the other elements.
The containing element is called the root element. An XML document must have a single root
tag, such that all other tags are contained within that root tag. All subsequent elements must be
contained within the root tag, each nested within its parent tag.
Exercise 6: Example of a root element (<PLANT> is the root element)
<?xml version="1.0"?>
<PLANT>
<COMMON>Columbine</COMMON>
<BOTANICAL>Aquilegia canadensis</BOTANICAL>
</PLANT>
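As a minimal sketch of the single-root rule, the Python standard library parser can be asked for the root element of a document. The helper root_tag and the second, deliberately broken sample are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

single_root = """<?xml version="1.0"?>
<PLANT>
  <COMMON>Columbine</COMMON>
  <BOTANICAL>Aquilegia canadensis</BOTANICAL>
</PLANT>"""

# Two top-level elements: this violates the single-root rule.
two_roots = "<PLANT/><PLANT/>"


def root_tag(text):
    """Return the root element's name, or None if the text is not well-formed."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError:
        return None


print(root_tag(single_root))
print(root_tag(two_roots))
```

The first document yields PLANT as its root, while the parser rejects the second because content follows the end of the first top-level element.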
Child elements
The root node has no parent, only children. All other nodes have one parent node, as well as zero or more
child nodes. Nodes that share the same parent are on the same hierarchical level and are called siblings.
<?xml version="1.0"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
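The parent and child relationships in the bookstore document can be listed with a short Python sketch. It uses a shortened copy of the document (the year and price elements are left out for brevity), and the helper outline is a hypothetical name.

```python
import xml.etree.ElementTree as ET

bookstore = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
  </book>
</bookstore>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(bookstore)
print("\n".join(outline(root)))
```

Each level of indentation in the printed outline corresponds to one level of nesting: bookstore is the root, the two book elements are siblings, and title and author are children of each book.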
The elements in an XML document form a document tree. The tree starts at the root and
branches to the lowest level of the tree. The tree structure for Exercise 7, as displayed by
the XML editor 'firstobject', is shown as follows:
XML elements are extensible
Exercise 8: For example, the document from Exercise 6 can be extended with a few more elements
<?xml version="1.0"?>
<PLANT>
<COMMON>Columbine</COMMON>
<BOTANICAL>Aquilegia canadensis</BOTANICAL>
<LIFE> 12 </LIFE>
<COUNTRY> AFRICA </COUNTRY>
</PLANT>
(iii) XML attribute
In addition to tags and elements, XML documents can also include attributes. Attributes are
simple name/value pairs associated with an element. Attributes must have values–even if that
value is just an empty string (like "")
Exercise 9: They are attached to the start-tag, as shown below, but not to the end-tag:
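A minimal Python sketch of these rules follows. The Phone element and its attribute names (type, ext) are illustrative assumptions, not taken from the original exercise.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if `text` is accepted by the parser as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Attributes are name/value pairs on the start-tag; a value may be empty.
phone = ET.fromstring('<Phone type="office" ext="">650-555-5000</Phone>')
print(phone.attrib)

# An attribute on the end-tag is rejected.
print(parses('<Phone>650-555-5000</Phone type="office">'))
```

The parser exposes the attributes as a dictionary, including the one with an empty value, and refuses the document that places an attribute on the end-tag.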
From a purely programming perspective, it could be stated that attributes should not be used
because of the following reasons:
o Elements help to define tree structure and attributes do not.
o Attributes are not allowed to have multiple values whereas elements can.
o Programming is more complex using attributes.
o Attributes are more difficult to alter in XML documents and are not easily expandable at a later stage.
o Attributes are difficult to read and maintain.
Use elements for data. Use attributes for information that is not relevant to the data.
XML attributes for metadata: attributes can provide metadata that may not be relevant to
most applications dealing with our XML.
Exercise 11: Attribute is used as meta data to identify the element: Sometimes ID references are
assigned to elements. These IDs can be used to identify XML elements in much the same way as
the id attribute in HTML. This example demonstrates this:
<?xml version="1.0"?>
<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget to XML Project!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>
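The id attribute can then be used to pick out one note. The Python sketch below works on a shortened copy of the document (the body elements are omitted), and the helper heading_of is a hypothetical name.

```python
import xml.etree.ElementTree as ET

messages = """<messages>
  <note id="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
  </note>
  <note id="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
  </note>
</messages>"""

root = ET.fromstring(messages)


def heading_of(note_id):
    """Return the heading of the note with the given id, or None if absent."""
    note = root.find(f"note[@id='{note_id}']")
    return None if note is None else note.findtext("heading")


print(heading_of("502"))
```

The id value is not part of the message data itself; it only serves to identify which note element is meant.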
(iv) Prolog
The prolog of an XML document, when present, precedes the document element. The prolog
may, but need not, contain the following:
An XML declaration
Miscellaneous content—processing instructions or comments
A Document Type Declaration, also called a DOCTYPE declaration
<song>
<title>Like A Surgeon</title>
<length>
<minutes>3</minutes>
<seconds>33</seconds>
</length>
<parody>
<title>Like A Virgin</title>
<artist>Madonna</artist>
</parody>
</song>
<song>
<title>Dare to be Stupid</title>
<length>
<minutes>3</minutes>
<seconds>25</seconds>
</length>
<parody/>
</song>
<!--There are more songs on this CD, but I didn't have time
to include them!-->
</CD>
Exercise 15:
(viii) XML tools
XML editors: They are used to create XML documents. Some of them are firstobject, Adobe
frameworker, XML Pro, Altova XMLSpy, Stylus Studio, XML Writer, enotepad, and XML Notepad.
XML browsers: IE 6, Netscape Navigator 6, Jumbo
XML parsers: MSXML, SAX, expat, expat perl module, TClExpat, LT XML, XML for Java,
XML test pad, XP, SXP, Python and XML processing preliminary XML parser
XML validators: W3C XML validator, Tidy, XML.com lark parser, LTP, STG, VS.net
(ix) XML editors help with typing XML and with identifying errors when the document is parsed
Exercise 16: Open the document from Exercise 11, introduce a few mistakes, and view the errors reported for the document
3. XML Structuring
An XML document can do more than just hold data; it can also specify the structure of that
data. This structuring is very important when a document deals with complex data.
For example, you could store a long account statement in HTML, but after
the first ten pages or so, that data would be prone to errors. But in XML, you can actually build
in the syntax rules that specify the structure of the document so that the document can be
checked to make sure it's set up correctly.
This emphasis on the correctness of your data's structure is strong in XML, and it makes it easy
to detect problems. In HTML, a Web author could (and frequently did) write sloppy HTML,
knowing that the Web browser would take care of any syntax problems. In fact, some people
estimate that 50% or more of the code in modern browsers is there to take care of sloppy HTML
in Web pages. But things are different in XML. The software that reads your XML—called an
XML processor—is supposed to check your document; if there's a problem, the processor is
supposed to quit. So how does an XML processor check your document? There are two main
checks that XML processors make: checking that your document is well-formed and checking
that it's valid.
A document is well-formed when it meets the following three conditions.
(i) Taken as a whole, it matches the production labeled document in the XML specification. To follow
this production, the document has three parts:
a prolog (XML declaration, PI, DTD)
a root element (which can contain other elements)
a miscellaneous part (unlike the preceding two parts, this part is optional)
(ii) It meets all the well-formedness constraints given in this specification (that is, the XML 1.0
specification, http://www.w3.org/TR/REC-xml). Every XML document must be well-formed.
This means it must adhere to a number of rules, including the following:
o Begin the Document with an XML Declaration
o Use Only Legal Character References. Note the characters that are legal in XML 1.0 differ
somewhat from what's legal in XML 1.1.
o Include at least one element (root element) and
o There must be exactly one root element.
o Using the Root Element to Contain All Other Elements
o Structure Elements Correctly.
Every start-tag must have a matching end-tag.
Elements may nest, but may not overlap.
Exercise 16: Simple example of well-formed documents with all the above points
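The structural rules above can be tried out with a non-validating parser. The following Python sketch is only an illustration; the sample strings and the helper is_well_formed are assumptions, not part of the exercise.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "properly nested": "<a><b>text</b></a>",
    "overlapping elements": "<a><b>text</a></b>",
    "missing end-tag": "<a><b>text</a>",
    "two root elements": "<a/><b/>",
    "no element at all": "just text",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```

Only the first sample passes: it has exactly one root element, every start-tag has a matching end-tag, and the elements nest without overlapping.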
XML processors assume that < starts a tag and & starts an entity reference, so you should
avoid using those characters for anything else.
o Make Attribute Names Unique. An element may not have two attributes with the same
name. (XML is case sensitive)
<message Text="Hi there!" text="Hello!">
o Avoid Entity References and < in Attribute Values. No unescaped < or & signs may occur
in the character data of an element or attribute
o Comments and processing instructions may not appear inside tags.
Comments begin with <!-- and end with the first occurrence of -->. For example:
<!-- I need to verify and update these links when I get a chance. -->
Comments may appear anywhere in the character data of a document. They may also
appear before or after the root element. However, comments may not appear inside a tag
or inside another comment.
Example 1: Exercise 17: XML file along with a cascading style sheet (example of a PI)
Example 2: A processing instruction may also contain program code, as shown below
<?php
mysql_connect("database.unc.edu", "clerk", "password");
$result = mysql("HR", "SELECT LastName, FirstName FROM Employees
ORDER BY LastName, FirstName");
$i = 0;
while ($i < mysql_numrows ($result)) {
  $fields = mysql_fetch_row($result);
  echo "<person>$fields[1] $fields[0] </person>\r\n";
  $i++;
}
mysql_close( );
?>
(iii) Each of the parsed entities, which is referenced directly or indirectly within the document, is
well-formed.
3.2 Valid XML documents
Well-formed XML data is guaranteed to use proper XML syntax, and a properly nested
(hierarchical) tree structure. This may be sufficient for relatively static internal applications,
particularly if the XML data is computer-generated and/or computer consumed. In this case, it's
the responsibility of the applications using the data to perform any structural or content
verification, error handling, and interpretation of the data. The XML structural information, and
the logic to do this, is usually hard-coded separately within the sending and receiving
applications, from a common specification. Therefore, any change to the XML data structure
must be made in three places: the specification, and the sending and receiving applications.
In addition to ensuring that XML data is well formed, many, if not most, XML applications will
also need to ensure that the data is valid XML. To do this, we need to:
Describe and validate the data structure, preferably in a rigorous and formal manner
Communicate this data structure to others - both applications and people
Constrain element content
Constrain attribute types and values, and perhaps provide default values
These functions could be handled by specific code within a pair of cooperating applications and
their accompanying documentation. However, in cases where the XML data is more widely
shared, say between multiple applications or users, maintaining these functions in each
application becomes an exponential nightmare. This is a problem common to most XML
applications, so ideally we'd like to take a more standardized approach. Separating the XML data
description from individual applications allows all cooperating applications to share a single
description of the data, known as the XML vocabulary. A group of XML documents that share a
common XML vocabulary is known as a document type, and each individual document that
conforms to a document type is a document instance.
3.2.1 XML DTD
A DTD is a set of declarations which can be incorporated within XML data, or exist as a separate
document. The DTD defines the rules that describe the structure and permissible content of the
XML data. Only one DTD may be associated with a given XML document or data object.
The most significant aspect of DTD validation is the definition of the structure of the hierarchical
tree of elements. A validating parser and a DTD can ensure that all necessary elements and
attributes are present in a document, and that there are no unauthorized elements or attributes.
This ensures that the data has a valid structure before it is handed over to the application. A DTD
can be used in conjunction with a validating parser to validate existing XML data or enforce
validity during the creation of XML documents by a human author, by:
The DTD lists all the elements, attributes, and entities the document uses and the contexts in
which it uses them. There are many things the DTD does not say. In particular, it does not say
the following:
A DTD:
o identifies elements and attributes
o provides a method for storing consistent data
o defines a meaningful structure for the content of an XML document
Creating a DTD is similar to creating a table in a database: the DTD specifies the elements
that can be present in an XML document, much as a table specifies its columns.
XML documents that conform to a DTD are considered valid documents.
The two types of DTD are internal and external:
o An internal DTD is part of the XML file and cannot be used across multiple documents.
o An external DTD is a separate file, and a reference to it is included in the XML document.
Exercise 18: Internal DTD implementation
<name>
<lastname>Gable</lastname>
<firstname>Clark</firstname>
</name>
<hiredate>October 25, 2005</hiredate>
<projects>
<project>
<product>Keyboard</product>
<id>555</id>
<price>$129.00</price>
</project>
<project>
<product>Mouse</product>
<id>666</id>
<price>$25.00</price>
</project>
</projects>
</employee>
</document>
</document>
A parser is the most basic yet most important XML tool. Every XML application is based on a
parser. A parser is a software component that sits between the application and the XML files. Its
goal is to shield the developer from the intricacies of the XML syntax.
How is XML parsed and validated against the DTD information? Parsers come in two types:
o Non-validating – the parser merely ensures that a data object is well-formed XML
o Validating – the parser uses a DTD (or other type of schema) to ensure the validity of a
well-formed data object's form and content
Some parsers work as both types, with configuration switches that determine whether or not the
document will be validated.
We make use of the MSXML parser, which exposes the XMLDOM object. To validate with the MSXML
parser, the following JavaScript code is used.
<html>
<body>
<h3>
This demonstrates a parser error:
</h3>
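<script type="text/javascript">
// The MSXML XMLDOM object is available only in Internet Explorer.
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;          // load synchronously
xmlDoc.validateOnParse = true; // validate against the DTD while parsing
xmlDoc.load("ex18.xml");
if (xmlDoc.parseError.errorCode != 0) {
  document.write(xmlDoc.parseError.reason);
}
</script>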
</body>
</html>
When the XML file contains an invalid statement, the output looks like:
The elements can be structured using DTD and checked against DTD as shown above. In a DTD,
elements are declared using the following syntax:
Using the container elements, one can precisely specify which other elements are allowed inside
an element, how often they may appear, and in what order. Example for element container model
for the given XML code snippet is as follows:
The following table lists the symbols used while specifying the element content in a DTD:
The following table lists the different value types that can be specified for an attribute in a DTD:
An attribute called pid is declared for the product element. The value type of the attribute is set to ID,
which indicates that its value should be unique for each appearance of the product element. Also,
#REQUIRED indicates that the attribute is mandatory.
The category attribute is declared for the product element. The value type of the attribute is an
enumerated list, in which the default value is toy.
<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person (name+, profession*)>
<!ELEMENT name EMPTY>
<!ATTLIST name first CDATA #REQUIRED
last CDATA #REQUIRED>
<!-- The first and last attributes are required to be present
but they may be empty. For example,
<name first="Cher" last=""> -->
<!ELEMENT profession EMPTY>
<!ATTLIST profession value CDATA #REQUIRED>
]>
<person>
<name first="Alan" last="Turing"/>
<profession value="computer scientist"/>
<profession value="mathematician"/>
<profession value="cryptographer"/>
</person>
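The declarations in an internal DTD such as the one above can be listed from Python. This is only a sketch: the expat parser in the Python standard library is non-validating, so it reports the declarations but does not check the document against them. The handler names and the shortened document are assumptions made for illustration.

```python
import xml.parsers.expat

document = """<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person (name+, profession*)>
<!ELEMENT name EMPTY>
<!ATTLIST name first CDATA #REQUIRED
               last CDATA #REQUIRED>
<!ELEMENT profession EMPTY>
<!ATTLIST profession value CDATA #REQUIRED>
]>
<person>
  <name first="Alan" last="Turing"/>
  <profession value="computer scientist"/>
</person>"""

declared_elements = []
declared_attributes = []


def on_element_decl(name, model):
    # Called once for every <!ELEMENT ...> declaration.
    declared_elements.append(name)


def on_attlist_decl(elname, attname, type_, default, required):
    # Called once for every attribute defined in an <!ATTLIST ...> declaration.
    declared_attributes.append((elname, attname, type_))


parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(document, True)

print(declared_elements)
print(declared_attributes)
```

A validating parser would go one step further and compare the person element and its children with these declarations.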
A CDATA attribute value can contain any string of text acceptable in a well-formed XML
attribute value. This is the most general attribute type.
When an XML processor parses an XML document, it interprets the markup in that document
and replaces entity references. If your text data contains the characters < and &, they will be
interpreted as part of the markup unless you convert them to &lt; and &amp;, which is called
escaping them. To avoid having to escape them, you can specify that you don't want the XML
processor to parse part of your text data by placing it in a CDATA section. CDATA stands for
character data, as opposed to parsed character data, which is PCDATA.
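The two options, escaping and a CDATA section, can be compared in a short Python sketch. The sample text and variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 && bonus > 0 then"

# Option 1: escape the special characters as entity references.
escaped_doc = "<message>" + escape(raw) + "</message>"

# Option 2: wrap the unescaped text in a CDATA section.
cdata_doc = "<message><![CDATA[" + raw + "]]></message>"

print(escaped_doc)
print(cdata_doc)
print(ET.fromstring(escaped_doc).text == ET.fromstring(cdata_doc).text)
```

Both documents carry the same character data after parsing; the CDATA form simply keeps the source text readable when it contains many < and & characters.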
An XML schema is used to define the structure of an XML document. A schema defines the list
of elements and attributes that can be used in an XML document. In addition to the list of
elements, an XML schema also specifies the order in which these elements appear in the XML
document and their data types. Schemas are written in the XML Schema Definition language
(XSD). XML schemas have become a W3C Recommendation for creating valid XML documents.
An XML Schema:
An XML schema created using XSD is very similar to a DTD, which is also used for defining
the structure of an XML document. However, an XML schema created using XSD has many
advantages over DTD. Some of them are:
o XSD provides more control than a DTD over the type of data that can be assigned to elements
and attributes.
o XSD lets you define your own customized data types; a DTD does not.
o XSD lets you specify restrictions on data. For example, one can ensure that the content of an
element is a positive integer value.
o The syntax for defining a DTD differs from the syntax of an XML document, whereas an XSD
is written in XML syntax.
o XSD is also supported by a variety of parsers.
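Because an XSD is itself an XML document, any XML parser can read it. The Python sketch below only reads a one-element schema and lists its element declarations; it does not validate anything, since the Python standard library has no XSD validator. The variable names are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema vocabulary.
XS = "http://www.w3.org/2001/XMLSchema"

schema_text = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="fullName" type="xs:string"/>
</xs:schema>"""

schema = ET.fromstring(schema_text)

# Collect the name and type of every xs:element declaration.
declarations = [
    (element.get("name"), element.get("type"))
    for element in schema.iter(f"{{{XS}}}element")
]
print(declarations)
```

The same approach would not work for a DTD, whose declaration syntax is not XML and therefore needs a dedicated parser.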
Exercise 22: XML schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="fullName" type="xs:string"/>
</xs:schema>
<?xml version="1.0"?>
<fullName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ex22.xsd">SRM University
</fullName>
Various XML schema creation tools: Hit Software, XmlArchitect, XMLspy, XRay, Microsoft
Visual studio.Net
Various XML schema validate tools: VS.Net, Topologi schematron validator, XML schema
quality checker, Xerces, XSD schema validator, XSV, Xerces J, IE
The entries should now be added to the appropriate drop-down menu. By installing these files,
entries will be added to the drop-down menu when you right-click on the browser window.
These entries will provide the following options:
Validate XML
View XSL Output
The above Validation procedure is applied to Exercise 22:
Exercise 23: Create an XML schema for the existing XML file in the .NET environment
Note: when Validate XML is chosen in .NET, any error in the XML file is highlighted in yellow
Exercise 24: HTML file which is used to validate xml file:
if (parser.load("ex22.xml")) {
document.write("The document is valid!"); }
else { if (parser.parseError.errorCode != 0) {
document.write(parser.parseError.reason); } }
</SCRIPT>
</HEAD> <BODY></BODY> </HTML>
Exercise 25: Validate the XML document with the given XSD:
XML file:
<?xml version="1.0"?>
<document xmlns="http://xmlpowercorp"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://xmlpowercorp ex25.xsd">
<text>
Welcome to XML Schemas!
</text>
</document>
XSD file:
<?xml version="1.0"?>
<xsd:schema targetNamespace="http://xmlpowercorp"
xmlns="http://xmlpowercorp"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
attributeFormDefault="qualified" elementFormDefault="qualified">
<xsd:element name="document">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="text" type="xsd:string" minOccurs="1" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
More about XML schema:
XSD provides a list of predefined data types. These data types can be classified as:
XSD allows definition of custom data types: simple and complex data type.
Element declarations: XML documents are composed primarily of nested elements, and the
xs:element element is one of the most often-used declarations in a typical schema.
This declaration uses two attributes to describe the element that can appear in the instance
document: name and type. The name attribute is self-explanatory, but the type attribute requires
some additional explanation. The other few optional attributes to describe xs:element are
minOccurs, maxOccurs.
Exercise 26: Develop XML schema file for the following XML file:
Step 1: Define simple data type for child elements
Step 2: Define complex data type for the parent element product
<xsd:element name="product">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productname" type="xsd:string"/>
<xsd:element name="desc" type="xsd:string"/>
<xsd:element name="price" type="xsd:positiveInteger"/>
<xsd:element name="quantity" type="xsd:nonNegativeInteger"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
Step 3: Define complex data type for root element with the integration Schema elements
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
</xsd:schema>
Step 4: Include xmlschema file for processing in XML file:
<?xml version="1.0"?>
<productdata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ex26.xsd">
<product>
<productname> iPOD </productname>
<quantity> 50 </quantity>
</product>
</productdata>
Additional information: students are encouraged to explore on their own the various schema
options used in this example:
XML file:
<?xml version="1.0"?>
<transaction borrowDate="2003-10-15">
<Lender phone="607.555.2222">
<name>Doug Glass</name>
<street>416 Disk Drive</street>
<city>Medfield</city>
<state>MA</state>
</Lender>
<Borrower phone="310.555.1111">
<name>Britta Regensburg</name>
<street>219 Union Drive</street>
<city>Medfield</city>
<state>CA</state>
</Borrower>
<note>Lender wants these back in two weeks!</note>
<books>
<book bookID="123-4567-890">
<bookTitle>Earthquakes for Breakfast</bookTitle>
<pubDate>2003-10-20</pubDate>
<replacementValue>15.95</replacementValue>
<maxDaysOut>14</maxDaysOut>
</book>
<book bookID="123-4567-891">
<bookTitle>Avalanches for Lunch</bookTitle>
<pubDate>2003-10-21</pubDate>
<replacementValue>19.99</replacementValue>
<maxDaysOut>14</maxDaysOut>
</book>
<book bookID="123-4567-892">
<bookTitle>Meteor Showers for Dinner</bookTitle>
<pubDate>2003-10-22</pubDate>
<replacementValue>11.95</replacementValue>
<maxDaysOut>14</maxDaysOut>
</book>
<book bookID="123-4567-893">
<bookTitle>Snacking on Volcanoes</bookTitle>
<pubDate>2003-10-23</pubDate>
<replacementValue>17.99</replacementValue>
<maxDaysOut>14</maxDaysOut>
</book>
</books>
</transaction>
XSD file:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:annotation>
<xsd:documentation> Book borrowing transaction schema. </xsd:documentation>
</xsd:annotation>
<xsd:element name="transaction" type="transactionType"/>
<xsd:complexType name="transactionType">
<xsd:sequence>
<xsd:element name="Lender" type="address"/>
<xsd:element name="Borrower" type="address"/>
<xsd:element ref="note" minOccurs="0"/>
<xsd:element name="books" type="books"/>
</xsd:sequence>
<xsd:attribute name="borrowDate" type="xsd:date"/>
</xsd:complexType>
<xsd:element name="note" type="xsd:string"/>
<xsd:complexType name="address">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:NMTOKEN"/>
</xsd:sequence>
<xsd:attribute name="phone" type="xsd:string" use="optional"/>
</xsd:complexType>
<xsd:complexType name="books">
<xsd:sequence>
<xsd:element name="book" minOccurs="0" maxOccurs="10">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="bookTitle" type="xsd:string"/>
<xsd:element name="pubDate" type="xsd:date" minOccurs='0'/>
<xsd:element name="replacementValue" type="xsd:decimal"/>
<xsd:element name="maxDaysOut">
<xsd:simpleType>
<xsd:restriction base="xsd:integer">
<xsd:maxInclusive value="14"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
</xsd:sequence>
<xsd:attribute name="bookID" type="catalogID"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
<xsd:simpleType name="catalogID">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\d{3}-\d{4}-\d{3}"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>
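The pattern facet of the catalogID type can be tried out on its own. The Python sketch below reuses the same regular expression; it is an approximation, since XSD and Python regular expressions differ in some details, and the helper is_catalog_id is a hypothetical name.

```python
import re

# Same expression as the xsd:pattern facet of catalogID.
# An XSD pattern must match the whole value, hence fullmatch().
CATALOG_ID = re.compile(r"\d{3}-\d{4}-\d{3}")


def is_catalog_id(value):
    """Return True if `value` has the form ddd-dddd-ddd."""
    return CATALOG_ID.fullmatch(value) is not None


for value in ["123-4567-890", "1234567890", "123-4567-8901"]:
    print(value, is_catalog_id(value))
```

The bookID values in the XML file above all have this three-four-three digit form, so they satisfy the restriction.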
4. XML Presentation Technologies
Data is stored in an XML document by using elements and attributes. XML focuses on data storage,
not on the presentation of data. Rendering refers to the act of processing XML documents so that they can
be displayed on a variety of targets, such as web browsers, email, pagers, and cell phones. XML
presentation technologies provide a modular way to deliver and display content to a variety of
devices. Here we examine some technologies for display, including CSS, XSL, XForms, and
VoiceXML.
4.1 CSS
Cascading style sheets is an XML-supporting technology for adding style display properties such
as fonts, colors, or spacing to Web documents. CSS origins may be traced to the SGML world,
which used a style sheet technology called DSSSL to control the display of SGML documents.
As Figure 2.10 shows, a style sheet tells a browser or other display engine how to display content.
Each rule is made up of a selector—typically an element name such as an HTML heading (H1) or
paragraph (P), or a user-defined XML element (Book)—and the style to be applied to the selector.
The CSS specification defines numerous properties (color, font style, point size, and so on) that
may be defined for an element. Each property takes a value which describes how the selector
should be presented.
/* A bulleted list */
ingredient {display: list-item; list-style-position: inside }
XML file:
<directions>
<step>Sift flour, baking powder, sugar &amp; salt together.</step>
<step>Add 1 cup corn meal.</step>
<step>
Beat egg in cup and add beaten egg and 1 1/2 cups whole
milk to make a batter. Stir well.
</step>
<step>
Add melted shortening and beat until light and thoroughly mixed.
</step>
<step>
Pour into greased shallow pan or greased muffin rings.
</step>
<step>
Bake in hot oven at <temperature>425 F</temperature> for
<duration>25 minutes</duration>.
</step>
<step optional="yes">
Cut into squares if cooked in shallow pan.
</step>
</directions>
<story>
This food is well prepared by <person> III MCA students </person> died,
Many persons used to like this in <city> Chennai </city>,
<state> Tamil Nadu </state>.
</story>
</recipe>
When the same properties are to be set for more than one element, the selectors can be combined into
a single group. For example, if quantity and component are both to be displayed in red, the rule can be
written as quantity, component {color: red}
4.2 XSL Formatting Objects (XSL-FO)
XSL 1.0 is a W3C Recommendation that provides users with the ability to describe how XML
data and documents are to be formatted. XSL does this by defining "formatting objects," such as
footnotes, headers, or columns.
An XSL style sheet is basically a series of pattern-action rules and looks like an XML document
with a mixture of two kinds of elements: those defined by XSL and those defined by the object
language. The patterns are similar to CSS's selectors, but the action part may create an arbitrary
number of "objects." The action part of the rule is called the "template" in XSL, and a template
and a pattern together are referred to as a "template rule."
The result of applying all matching patterns to a document recursively is a tree of objects, which
is then interpreted top-down according to the definition of each object. For example, if they are
HTML objects, an HTML document will be generated; if they are XML objects, XML will be
the result.
                          CSS    XSL
Can be used with HTML?    yes    no
Can be used with XML?     yes    yes
Transformation language?  no     yes
Syntax                    CSS    XML
Figure 2.11 illustrates some of the different options for using CSS and XSL to create displays
based on HTML or XML. The general principle is that if the document is to be simply rendered
and not transformed in any way through the addition or deletion of items, then CSS is the more
straightforward approach.
Exercise 28: Implement XSL formatting
XML file:
<?xml version="1.0" encoding ="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="ex28.xsl"?>
<states>
<state>
<name>California</name>
<population units="people">33871648</population><!--2000 census-->
<capital>Sacramento</capital>
<bird>Quail</bird>
<flower>Golden Poppy</flower>
<area units="square miles">155959</area>
</state>
<state>
<name>New York</name>
<population units="people">18976457</population><!--2000 census-->
<capital>Albany</capital>
<bird>Bluebird</bird>
<flower>Rose</flower>
<area units="square miles">47214</area>
</state>
</states>
XSL file:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="states">
<HTML>
<BODY>
<xsl:apply-templates/>
</BODY>
</HTML>
</xsl:template>
<xsl:template match="state">
<P> <b><i> <xsl:value-of select="name"/> </i></b> </P>
</xsl:template>
</xsl:stylesheet>
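The effect of the two template rules can be imitated in Python to make the pattern-action idea concrete. This is not an XSLT processor (the Python standard library does not include one); it is a hand-written equivalent under that assumption, using a shortened copy of the states document, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

states_xml = """<states>
  <state><name>California</name><capital>Sacramento</capital></state>
  <state><name>New York</name><capital>Albany</capital></state>
</states>"""


def render_state(state):
    # Plays the role of the template rule that matches "state":
    # output the value of the name child in bold italics.
    return "<P><b><i>" + state.findtext("name") + "</i></b></P>"


def render_states(states):
    # Plays the role of the template rule that matches "states":
    # wrap the results for the child elements in HTML and BODY.
    body = "".join(render_state(state) for state in states.findall("state"))
    return "<HTML><BODY>" + body + "</BODY></HTML>"


html = render_states(ET.fromstring(states_xml))
print(html)
```

Each function corresponds to one template rule: the pattern decides which elements it applies to, and the body of the function is the action that builds the output.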
4.3 XFORMS
XForms is an XML approach that overcomes the limitations of HTML forms. XForms is a GUI
toolkit for creating user interfaces and delivering the results in XML. Figure 2.12 illustrates how
a single device-independent XML form definition, called the XForms Model, has the capability
to work with a variety of standard or proprietary user interfaces. For example, the Voice Browser
Working Group is looking at developing voice-based user interface components for XForms.
<body>
<h1>Using XForms</h1> <p>Input Control</p>
<xforms:input ref="/data/input"></xforms:input>
<p>Select Control</p>
<xforms:select appearance="full" ref="/data/select">
<xforms:item>
<xforms:value>1</xforms:value>
<xforms:label>Item 1</xforms:label>
</xforms:item>
<xforms:item>
<xforms:value>2</xforms:value>
<xforms:label>Item 2</xforms:label>
</xforms:item>
<xforms:item>
<xforms:value>3</xforms:value>
<xforms:label>Item 3</xforms:label>
</xforms:item>
</xforms:select>
<p>Button</p>
<xforms:trigger>
<xforms:label>Click Me</xforms:label>
<xforms:message ev:event="click" level="ephemeral"
ref="/data/message"/>
</xforms:trigger>
<p>Select Boolean</p>
<xforms:selectboolean ref="/data/selectboolean">
<xforms:label>Click Me</xforms:label>
</xforms:selectboolean>
<p>Submit and Reset Buttons</p>
<xforms:submit>
<xforms:label>Submit</xforms:label>
</xforms:submit>
<xforms:trigger>
<xforms:label>Reset</xforms:label>
<xforms:reset ev:event="DOMActivate"/>
</xforms:trigger>
</body>
</html>
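The controls in the listing above bind to instance data through their ref attributes (/data/input, /data/select, and so on). A minimal sketch of an XForms Model that such controls could bind to is given here. It assumes XForms 1.0 syntax; the instance values, the submission id "send", and the example.com address are hypothetical and are not taken from the original form.

```xml
<!-- Sketch only: values, id, and URL are hypothetical placeholders. -->
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms">
  <!-- The instance holds the XML data that the form controls read and write -->
  <xforms:instance>
    <data xmlns="">
      <input>Hello</input>
      <select>1</select>
      <selectboolean>false</selectboolean>
      <message>Button clicked</message>
    </data>
  </xforms:instance>
  <!-- Describes where and how the instance data is sent on submit;
       a submit control would point here with submission="send" -->
  <xforms:submission id="send" method="post" action="http://example.com/submit"/>
</xforms:model>
```

The model would normally sit in the head of the host document, which keeps the data definition separate from the user interface controls in the body.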
4.4 XHTML
The capability of XHTML to be more flexible than HTML is attributable to the use of XHTML
modules for creating XHTML-conforming markup languages. New XHTML-compliant
languages must use the basic XHTML framework as well as other XHTML modules. As
illustrated in Figure 2.13, modules plug together within the XHTML framework to define a
markup language that is task or client specific. Documents developed based on the new markup
language will be usable on any XHTML-conforming clients.
4.5 VoiceXML
VoiceXML is an emerging standard for speech-enabled applications. Its XML syntax defines
elements to control a sequence of interaction dialogs between a user and an implementation
platform. The elements defined as part of VoiceXML control dialogs and rules for presenting
information to and extracting information from an end-user using speech. As Figure 2.14 illustrates,
VoiceXML documents are stored on Web servers. Translation from text to voice is carried out
either on a specialized server that delivers voice directly to a phone or by the device itself using
speech processing technology.
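A dialog of this kind might look like the following sketch. It assumes VoiceXML 2.0 syntax; the form id, the field name, and the prompt wording are invented for illustration, and the state data is borrowed from the exercises in these notes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: ids, field names, and prompts are hypothetical. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="capitals">
    <!-- A block is played once when the form is entered -->
    <block>
      <prompt>Welcome to the state information service.</prompt>
    </block>
    <!-- A field presents a prompt and waits for a spoken yes or no answer -->
    <field name="answer" type="boolean">
      <prompt>Would you like to hear the capital of California?</prompt>
      <!-- filled runs after the user's answer has been recognized -->
      <filled>
        <if cond="answer">
          <prompt>The capital is Sacramento.</prompt>
        <else/>
          <prompt>Goodbye.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```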
XML documents excel at storing data, and this has led developers to wonder if XML will
ultimately be able to solve an old problem: being able to directly compare and classify the data in
multiple documents. For example, consider the World Wide Web as it stands today: There can be
thousands of documents on a particular topic, but how can you possibly compare them? For
example, a search for the term XML turns up millions of matches, but it would be extraordinarily
difficult to write a program that would compare the data in those documents because all that data
isn't stored in any remotely compatible format.
The idea behind XML information sets, also called infosets, is to set up an abstract way of
looking at an XML document so that it can be compared to others. To have an infoset, XML
documents may not use colons in tag and attribute names unless they are used to support
namespaces. Documents do not need to be valid to have an infoset, but they need to be well
formed.
An XML document's information set consists of two or more information items (the information
set for any well-formed XML document contains at least the document information item and one
element information item). An information item is an abstract representation of some part of an
XML document, and each information item has a set of properties, some of which are considered
core and some of which are considered peripheral.
Although infosets are a good idea, they are only abstract formulations of the information in an
XML document. So without reducing an XML document to its infoset, how can you approach the
goal of comparing XML documents byte by byte? It turns out that there is a way: You can use
canonical XML. Canonical XML is a companion standard to XML. The canonical XML syntax is
very strict; for example, canonical XML uses UTF-8 character encoding only, carriage-return
linefeed pairs are replaced with linefeeds, CDATA sections are replaced by their character
content, all entity references must be expanded, and much more, as specified in
www.w3.org/TR/xml-c14n. Because canonical XML is intended to be byte-by-byte correct, the
upshot is that if you need a document in canonical form, you should use software to convert your
XML documents to that form.
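To make these rules concrete, here is a small sketch of a document before and after canonicalization. The document (a note with a hypothetical date attribute and a hypothetical entity named who) is invented for illustration, and the second listing is worked out by hand from the rules of the specification, not produced by a particular tool.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note [
  <!ENTITY who "Tove">
]>
<note  id = '501'   date="2003-01-01">
  <to>&who;</to>
  <body><![CDATA[Watch <FIFA> tonight]]></body>
  <flag/>
</note>
```

In canonical form the XML declaration and the document type declaration are gone, the attributes are sorted by name and written with double quotes, the entity reference is expanded, the CDATA section is replaced by escaped character content, and the empty element becomes a start tag and end tag pair:

```xml
<note date="2003-01-01" id="501">
  <to>Tove</to>
  <body>Watch &lt;FIFA&gt; tonight</body>
  <flag></flag>
</note>
```

Two documents that differ only in such surface details reduce to the same bytes, which is what makes a byte-by-byte comparison meaningful.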
One such package that will convert valid XML documents to canonical form comes with the
XML for Java software that you can get from IBM's AlphaWorks (the Web site is
http://www.alphaworks.ibm.com/tech/xml4j).
6. XML Transformation
XSL consists of two parts: a transformation language (XSLT) and a formatting language (XSL-FO).
The transformation language lets you transform the structure of documents into different forms
(such as PDF, WML, HTML, or another schema type), while the formatting language actually
formats and styles documents in various ways. These two parts of XSL can function quite
independently, and you can think of XSL as two languages, not one. In practice, you often
transform a document before formatting it.
An XSLT transformation can be performed in three ways:
o On the server: A server program, such as a Java servlet or a JavaServer Page (JSP), can
use a stylesheet to transform a document automatically and serve it to the client. One
such example is the XML Enabler, which is a servlet you'll find at the XML For Java
Web site, www.alphaworks.ibm.com/tech/xml4j.
o On the client: A client program, such as a browser, can perform the transformation,
reading in the stylesheet that you specify with the <?xml-stylesheet?> processing
instruction. Internet Explorer can handle transformations this way, to some extent.
o With a separate program: Several standalone programs, usually based on Java, will
perform XSLT transformations.
6.1 XSLT
Comparison between XSLT and CSS:
| CSS | XSLT |
|---|---|
| Simple to use and suitable for simple documents. | Complex to use. It is a superset of CSS functionality. |
| Cannot reorder, add, delete, or perform operations on elements. | Can reorder, add, and delete elements, since it is aware of the structure of the XML document. |
| Does not offer access to non-elements, such as attributes and their values and processing instructions. | Able to access and manipulate the comments, processing instructions, and attribute values and names within an XML document. |
| Uses less memory, since it cannot reorder a document and therefore does not need to build a tree representation of the document. | Uses more memory and processor power, since reordering, adding, deleting, and manipulating elements require a tree representation of the document in memory. |
| Uses a different syntax than XML. | Written using XML and therefore has the same syntax as XML. |
Working of XSLT
[Figure: Working of XSLT. Labels: XML doc, XSLT style sheet, MSXML parser, source tree, XSLT tree, XSLT parser, result tree.]
The XSLT processor comes packaged with the MSXML parser. Since XSLT is an application
of XML, the MSXML parser is also used to parse an XSLT document. The MSXML parser
parses the XSLT stylesheet and creates a tree structure based on the elements and attributes used
in an XSLT document. This tree is referred to as the XSLT tree.
The XSLT processor component of the MSXML parser takes the transformation information
contained in the XSLT stylesheet, applies it to the data retrieved from the source document, and
builds a resultant tree structure referred to as the result tree. This tree is then rendered to various
targets, such as web browsers, pagers, and cell phones.
The stylesheet element is the root element of all XSLT stylesheets. The xsl prefix is bound to
the namespace URI for XSLT, http://www.w3.org/1999/XSL/Transform.
value-of element
<xsl:value-of select="productname" />
The value-of element is an empty element. Its select attribute names the element or
attribute whose value is to be displayed. To display the value of an attribute, prefix its name
with the @ symbol: <xsl:value-of select="@category" />
for-each element: This element is used to instruct the XSLT processor to process the information
for each instance of the specified pattern. Parent/Child pattern:
<xsl:for-each select="productdata/product">
<font color="blue"> <xsl:value-of select="productname" /> </font>
</xsl:for-each>
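The value-of and for-each fragments above can be put together into a complete stylesheet, sketched here. It assumes an input document whose root element is productdata, containing product elements that each have a category attribute and a productname child, as the fragments suggest; the HTML list layout is an illustrative choice.

```xml
<?xml version="1.0"?>
<!-- Sketch only. Assumed input: a productdata root element containing
     product elements, each with a category attribute and a productname child. -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html>
      <body>
        <ul>
          <!-- Visit every product child of the productdata element -->
          <xsl:for-each select="productdata/product">
            <li>
              <!-- Element value: the text of the productname child -->
              <xsl:value-of select="productname"/>
              <xsl:text> (</xsl:text>
              <!-- Attribute value: note the @ prefix -->
              <xsl:value-of select="@category"/>
              <xsl:text>)</xsl:text>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Inside the for-each, each product element becomes the current node in turn, so the relative paths productname and @category are resolved against that product.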
XSLT template rules:
A template rule describes how an XML element and its contents are converted into a format that
can be displayed in the browser. A template rule consists of two parts:
The template element: It is used to define a template for the desired output. The syntax is as follows:
<xsl:template match="pattern">
[action to be taken]
</xsl:template>
The apply-templates element: It is used to instruct the XSLT processor to find an appropriate
template and perform the specified tasks on each selected element. The syntax for using this
element is as follows:
<xsl:apply-templates [select="pattern"]/>
The select attribute is optional and is used to specify the nodes to which templates should be
applied. The default value for this attribute is "node()", which means that templates are
applied to all the children of the current node.
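How the two elements cooperate can be sketched with the states document used in the exercises of these notes. The choice of three rules and the HTML markup is illustrative only and is not one of the original exercise solutions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Rule 1: matches the document root and hands only the state
       elements on to whichever rules match them -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="states/state"/>
      </body>
    </html>
  </xsl:template>

  <!-- Rule 2: selected by the processor once for each state element -->
  <xsl:template match="state">
    <p>
      <b><xsl:value-of select="name"/></b>
      <xsl:text>: capital </xsl:text>
      <!-- select narrows further processing to the capital child -->
      <xsl:apply-templates select="capital"/>
    </p>
  </xsl:template>

  <!-- Rule 3: formats a capital element -->
  <xsl:template match="capital">
    <i><xsl:value-of select="."/></i>
  </xsl:template>
</xsl:stylesheet>
```

Because each apply-templates carries a select attribute, the bird, flower, and other children of state are never processed. Without select, all children would be processed, and the built-in rules would copy their text to the output.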
Exercise 29: XSLT transformation with various options
XML file:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="ex29a.xsl"?>
<states>
<state>
<name>California</name>
<population units="people">33871648</population><!--2000 census-->
<capital>Sacramento</capital>
<bird>Quail</bird>
<flower>Golden Poppy</flower>
<area units="square miles">155959</area>
</state>
<state>
<name>New York</name>
<population units="people">18976457</population><!--2000 census-->
<capital>Albany</capital>
<bird>Bluebird</bird>
<flower>Rose</flower>
<area units="square miles">47214</area>
</state>
</states>
Ex29a.XSL file:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
</xsl:stylesheet>
Ex29b.XSL file:
Exercise 30: Implement for-each and sort element option:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="ex30.xsl"?>
<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget to watch FIFA!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>
Ex30.xsl:
<xsl:template match="/">
<xsl:for-each select="messages/note">
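A minimal sketch of how for-each and sort might be combined for the messages document of this exercise follows. It is not the original Ex30.xsl; the sort key (the to element) and the output layout are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html>
      <body>
        <xsl:for-each select="messages/note">
          <!-- xsl:sort must come first inside xsl:for-each;
               here the notes are ordered by recipient name -->
          <xsl:sort select="to" order="ascending"/>
          <p>
            <xsl:value-of select="@id"/>
            <xsl:text>: to </xsl:text>
            <xsl:value-of select="to"/>
            <xsl:text>, from </xsl:text>
            <xsl:value-of select="from"/>
          </p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With the two notes shown above, note 502 (to Jani) is output before note 501 (to Tove), the reverse of document order.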
Exercise 31: Make use of if statement and choose-when
<projects>
<project>
<product>Printer</product>
<id>111</id>
<price>111</price>
</project>
<project>
<product>Laptop</product>
<id>222</id>
<price>989</price>
</project>
<project>
<product>Keyboard</product>
<id>555</id>
<price>129</price>
</project>
<project>
<product>Mouse</product>
<id>666</id>
<price>25</price>
</project>
</projects>
Ex31b.xsl : using the choose and when elements
<xsl:choose>
<xsl:when test="price[. &lt; 500]">
<font color="red"> <xsl:value-of select="product" /> </font>
</xsl:when>
<xsl:otherwise>
<font color="green"> <xsl:value-of select="product" /> </font>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Note: The XSLT if element provides a simple if-then construct with no else branch. The choose
element is used to make a choice when there are two or more possible courses of action. It
provides a means for testing multiple conditions.
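The difference between the two constructs can be sketched in one complete stylesheet for the projects document of Exercise 31. The price thresholds (100 and 500) and the labels are hypothetical. Note that the less-than sign in a test attribute has to be written as &lt;, because a stylesheet is itself an XML document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html>
      <body>
        <xsl:for-each select="projects/project">
          <p>
            <xsl:value-of select="product"/>
            <!-- xsl:if: a single test with no else branch -->
            <xsl:if test="price &lt; 100">
              <xsl:text> (budget item)</xsl:text>
            </xsl:if>
            <xsl:text>: </xsl:text>
            <!-- xsl:choose: the first xsl:when whose test is true wins;
                 xsl:otherwise covers the remaining cases -->
            <xsl:choose>
              <xsl:when test="price &lt; 100">low</xsl:when>
              <xsl:when test="price &lt; 500">medium</xsl:when>
              <xsl:otherwise>high</xsl:otherwise>
            </xsl:choose>
          </p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

For the data above, Printer (111) and Keyboard (129) are labeled medium, Laptop (989) is labeled high, and Mouse (25) is labeled low and is the only product marked as a budget item.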
Additional information for students to explore on their own:
A server-side transformation in a JavaServer Page (JSP) uses code such as the following snippet.
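<%@ page import="javax.xml.transform.*, javax.xml.transform.stream.*, java.io.*" %>
<%
// The file names below are hypothetical placeholders; the original snippet does not give them.
try
{
TransformerFactory transformerfactory = TransformerFactory.newInstance();
// Compile the stylesheet, then transform the source document into a result file.
Transformer transformer = transformerfactory.newTransformer(new StreamSource(new File("ex29a.xsl")));
transformer.transform(new StreamSource(new File("states.xml")), new StreamResult(new File("result.html")));
}
catch(Exception e) { e.printStackTrace(); }
// Read the result file back and copy it into the response line by line.
FileReader filereader = new FileReader("result.html");
BufferedReader bufferedreader = new BufferedReader(filereader);
String instring;
while((instring = bufferedreader.readLine()) != null) {
%>
<%= instring %>
<%
}
filereader.close();
%>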
6.2 XPATH
The XML 1.0 syntax provides a straightforward, standard way to exchange information between
computer programs. The XML Path Language, XPath, plays an important part in such exchange
of information between computers or computer applications. XPath is used to navigate XML tree
structures. XPath gets its name from its use of a path notation to navigate through the
hierarchical tree structure of an XML document. Because all XML documents can be represented
as a tree of nodes, XPath allows for the selection of a node or group of nodes through the use of a
compact, non-XML syntax. It is an important XML technology due to its role in providing a
common syntax and semantics for functionality in both XSLT and XPointer.
Figure 2.16 shows that XPath operates on the hierarchical tree structure of an XML document
rather than its tag-based syntax. It is capable of distinguishing between different types of nodes,
including element nodes, attribute nodes, and text nodes. XPath is designed to enable addressing
of, or navigation to, chosen parts of an XML document. In support of that aim, XPath provides a
number of functions for the manipulation of strings, numbers, Booleans, and node-sets.
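A few XPath expressions of each kind are sketched below inside a small XSLT stylesheet, since XPath is normally embedded in a host language. The sketch assumes the states document from Exercise 29 as input; the particular expressions are illustrative choices.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Node-set function: how many state elements are there? -->
    <xsl:value-of select="count(/states/state)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Predicate on a child element's value, then a step to another child -->
    <xsl:value-of select="/states/state[name='New York']/capital"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Attribute node, reached with the @ abbreviation -->
    <xsl:value-of select="/states/state[1]/area/@units"/>
    <xsl:text>&#10;</xsl:text>
    <!-- String function -->
    <xsl:value-of select="string-length(/states/state[1]/name)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Number function applied to a node-set -->
    <xsl:value-of select="sum(/states/state/population)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Boolean result of a comparison -->
    <xsl:value-of select="/states/state[1]/area &gt; /states/state[2]/area"/>
  </xsl:template>
</xsl:stylesheet>
```

For the two states of Exercise 29 the six expressions evaluate to 2, Albany, square miles, 10, 52848105, and true, one value per output line.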
Ex32.xsl : XPATH
<xsl:for-each select="purchase/product">
<xsl:for-each select="order">
<b> order id: </b> <xsl:value-of select="@id"/> <br/>
quantity <xsl:value-of select="@quantity"/> <br/>
Order value <xsl:value-of select="(../@price)*(@quantity)"/> <br/>
</xsl:for-each>
<HR/>
Exercise 33: Data in tabular form
Ex33.xsl
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" omit-xml-declaration="yes" />
<xsl:template match="/">
<html> <head> <title> Sales Report, <xsl:value-of select="/SalesReport/Company" />:
<xsl:value-of select="/SalesReport/Period" /> </title> </head>
<xsl:for-each select="/SalesReport/Sales">
<tr> <td align="center"><xsl:value-of select="@Region"/></td>
<td align="center"><xsl:value-of select="."/></td> </tr>
</xsl:for-each>
6.3 XLINK
XLink will enable bidirectional Web linking. The notion of resources is universal to the World
Wide Web. According to the Internet Engineering Task Force, a "resource" is any addressable
unit of information or service. Examples include files, images, documents, programs, and query
results. These resources are addressed using a URI reference. What XLink brings to the table is
the ability to address a portion of a resource. For example, if the entire resource is an XML
document, a useful portion of that resource might be a single element within the document.
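The two kinds of link can be sketched as follows. The element names (reviews, review, related, doc, go), the file names, and the fragment identifier are hypothetical; only the attributes in the xlink namespace carry linking meaning.

```xml
<?xml version="1.0"?>
<!-- Sketch only: element names, file names, and ids are hypothetical. -->
<reviews xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Simple link: one outbound link, comparable to an HTML anchor.
       The fragment after # addresses a portion of the target resource. -->
  <review xlink:type="simple"
          xlink:href="states.xml#california"
          xlink:title="California entry"
          xlink:show="new"
          xlink:actuate="onRequest">See the California entry</review>

  <!-- Extended link: two remote resources joined by arcs in both
       directions, which is what makes the link bidirectional -->
  <related xlink:type="extended">
    <doc xlink:type="locator" xlink:href="states.xml" xlink:label="data"/>
    <doc xlink:type="locator" xlink:href="ex29a.xsl" xlink:label="style"/>
    <go xlink:type="arc" xlink:from="data" xlink:to="style"/>
    <go xlink:type="arc" xlink:from="style" xlink:to="data"/>
  </related>
</reviews>
```

Any element can act as a link once it carries an xlink:type attribute, so linking is not tied to one element name as it is in HTML.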
ex34.css
6.4 XPOINTER
When you use XLinks, you can link to a particular document, but many times, you want to be
more precise than that. XPointers let us point to specific locations inside a document, and they
are coming into more common use.
The XML Pointer language, XPointer, is intended as a provider of fragment identifiers for XML
documents. Expressed more formally, XPointer is the fragment identifier language for resources
whose type is one of text/xml, application/xml, text/xml-external-parsed-entity, or
application/xml-external-parsed-entity.
Consider the code snippet:
<list ID="MyList">
<item> First </item>
<item>Second</item>
<item>Third </item>
</list>
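If we want to refer to what XPointer terms a singleton location-set, consisting of the second of
the items within the <list> element, we would use the following XPointer:
xpointer(id('MyList')/item[2])
To address a range instead, from the start of the first item to the end of the third, we would use:
xpointer(id('MyList')/item[1]/range-to(following-sibling::item[2]))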
6.6 XQUERY
XQuery is a W3C initiative to define a standard set of constructs for querying and searching
XML documents. The XML Query Working Group draws its membership from both the
document and the database communities, trying to hammer out an agreement on XML-based
query syntax that meets the needs of both cultures.
XQuery is a standardized language that can be used to query XML documents much as SQL is
used to query relational database tables. Essentially, XQuery consists of a data model and a set of
query operators that operate on that data model.
The following path expression uses the doc function to open books.xml and a predicate to select
all the book elements under the bookstore element that have a price element with a value that is
less than 30:
doc("books.xml")/bookstore/book[price<30]
The following XQUERY FLWOR expression applies the same condition, but it also sorts the
selected books by title and returns only their title elements:
for $i in doc("books.xml")/bookstore/book
where $i/price<30
order by $i/title
return $i/title