
CS 311: Introduction to Web Services Development

The document discusses the key concepts and syntax rules of XML including the structure of XML documents, elements, attributes, namespaces, DTDs, XSLT, XPath and XLink. It provides detailed explanations and examples of each concept.

Uploaded by

aliflam

 All XML documents must contain a single tag pair to define the root element.

 All other elements must be nested within the root element. All elements can have
sub-elements (children).
 Sub-elements must be in pairs and correctly nested within their parent element.
 The XML declaration is case sensitive and must begin with "<?xml", where "xml" is
written in lower-case.
 The XML declaration, if present, must be the first statement in the XML document.
 The HTTP protocol can override the encoding value that you put in the XML
declaration (for example, through a charset parameter in the Content-Type header).
 Attribute names in XML (unlike HTML) are case sensitive. That is, HREF and href are
considered two different XML attributes.
 The same attribute cannot appear twice in one tag. The following example shows
incorrect syntax because the attribute b is specified twice:
<a b="x" c="y" b="z">....</a>
 Attribute names are written without quotation marks, whereas attribute values must
always appear in quotation marks. The following example demonstrates incorrect XML syntax:
<a b=x>....</a>
 To avoid character encoding problems, all XML files should be saved as Unicode UTF-8
or UTF-16 files.
 Whitespace characters like blanks, tabs and line-breaks between XML elements and
between XML attributes are generally treated as insignificant.
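The attribute rules above can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative assumptions, not part of the original notes.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Illustrative samples based on the rules listed above.
samples = {
    "duplicate attribute": '<a b="x" c="y" b="z">text</a>',
    "unquoted value": "<a b=x>text</a>",
    "case mismatch in tags": "<Note>text</note>",
    "correct": '<a b="x" c="y">text</a>',
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")

# HREF and href are two different attributes, so both can appear together.
link = ET.fromstring('<a HREF="one" href="two"/>')
print(sorted(link.attrib))
```

Only the last sample is accepted; the first three are rejected before any application code sees the data.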
 The purpose of a DTD (Document Type Definition) is to define the legal building
blocks of an XML document.
 A DTD defines the document structure with a list of legal elements.
 A DTD can be declared inline in your XML document, or as an external reference.
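As a rough illustration of an inline DTD, the sketch below embeds one in a small document and parses it in Python. Note the assumptions: the element values are made up, and xml.etree.ElementTree reads the DOCTYPE but does not validate against it, so the child order is compared by hand here.

```python
import xml.etree.ElementTree as ET

# Illustrative document: the DTD is declared inline between [ and ].
NOTE_WITH_INLINE_DTD = """<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

# Content model of "note" as written in the DTD above.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]

root = ET.fromstring(NOTE_WITH_INLINE_DTD)
child_names = [child.tag for child in root]

# Manual check only; a validating parser would enforce this automatically.
matches_content_model = child_names == EXPECTED_CHILDREN
print(child_names, matches_content_model)
```

A validating parser would be needed to enforce the DTD; this sketch only shows where the declarations live and what they describe.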
 A declaration such as <!ELEMENT note (to,from,heading,body)> defines the element
"note" as having four child elements: "to", "from", "heading" and "body".
 XML provides an application independent way of sharing data.
 With a DTD, independent groups of people can agree to use a common DTD for
interchanging data. Your application can use a standard DTD to verify that data that you
receive from the outside world is valid. You can also use a DTD to verify your own data.
 A lot of forums are emerging to define standard DTDs for almost everything in the areas
of data exchange.
 XML Namespaces provide a method to avoid element name conflicts.
 In XML, element names are defined by the developer. This often results in a conflict
when trying to mix XML documents from different XML applications.
 If two XML fragments that each define a <table> element are combined, there is a
name conflict. Both contain a <table> element, but the elements have different content
and meaning.
 Name conflicts in XML can easily be avoided using a name prefix.
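A short sketch of how prefixes keep two <table> elements apart, using Python's xml.etree.ElementTree. The namespace URI for furniture and the element contents are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Two different "table" elements, separated by the prefixes h: and f:.
DOC = """<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="https://example.com/furniture">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:name>Coffee Table</f:name><f:width>80</f:width></f:table>
</root>"""

# Prefix-to-URI map used for lookups; the prefixes here need not match the
# ones in the document, only the URIs matter.
NS = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "https://example.com/furniture",
}

root = ET.fromstring(DOC)
html_cells = [td.text for td in root.findall("h:table/h:tr/h:td", NS)]
furniture_name = root.find("f:table/f:name", NS).text

# Internally each tag carries its namespace URI, so the two names differ.
table_tags = [child.tag for child in root]
print(html_cells, furniture_name, table_tags)
```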
 XML allows sets of documents which are all the same type to be created and handled
consistently and without structural errors
 XML provides a common syntax for messaging systems for the exchange of information
between applications.
 If everyone uses the same syntax it makes writing these systems much faster and more
reliable.
 XML is free. It doesn't belong to anyone, so it can't be hijacked or pirated. And you don't
have to pay a fee to use it.
 XML information can be manipulated programmatically so XML documents can be
pieced together from disparate sources, or taken apart and re-used in different ways. They
can be converted into any other format with no loss of information.
 XML can also be used to store data in files or in databases. Applications can be written to
store and retrieve information from the store, and generic applications can be used to
display the data.
 In many HTML applications, XML is used to store or transport data, while HTML is
used to format and display the same data.
 When displaying data in HTML, you should not have to edit the HTML file when the
data changes.
 With a few lines of JavaScript code, you can read an XML file and update the data
content of any HTML page.
 XML documents are formed as element trees.
 An XML tree starts at a root element and branches from the root to child elements.
 All elements can have sub elements (child elements)
 The terms parent, child, and sibling are used to describe the relationships between
elements.
 Parents have children. Children have parents. Siblings are children on the same level
(brothers and sisters).
 An XML element is everything from (including) the element's start tag to (including) the
element's end tag.
 <title>, <author>, <year>, and <price> have text content because they contain text (like
29.99). <bookstore> and <book> have element contents, because they contain elements.
<book> has an attribute (category="children")
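The parent, child and sibling relationships can be explored in code. The sketch below assumes a small bookstore document shaped like the one described above (the title and author values are made up) and walks it with Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical bookstore document matching the description in the notes.
BOOKSTORE = """<bookstore>
  <book category="children">
    <title>Example Story</title>
    <author>A. Writer</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title>Example XML Guide</title>
    <author>B. Writer</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)      # root element: <bookstore>
first_book = root[0]                 # first child of the root
sibling_tags = [child.tag for child in first_book]  # children of one <book>

# ElementTree keeps no parent pointers, so build a child-to-parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

print(root.tag, first_book.get("category"), sibling_tags)
print(parent_of[first_book.find("price")].tag)
```

Here <title>, <author>, <year> and <price> are siblings, <book> is their parent, and <bookstore> is the root.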
 XML elements can be defined as the building blocks of an XML document. Elements can
behave as containers to hold text, elements, attributes, media objects or all of these.
 Each XML document contains one or more elements, the scope of which is either
delimited by start and end tags, or for empty elements, by an empty-element tag.
 An element with no content is said to be empty.

 In XML, you can indicate an empty element like this: <element></element>
 You can also use a so-called self-closing tag: <element />
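A quick sketch showing that a parser treats both spellings of an empty element the same way. The helper is_empty is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET


def is_empty(element):
    """An element is empty when it has no child elements and no text."""
    return len(element) == 0 and not element.text


paired = ET.fromstring("<element></element>")
self_closing = ET.fromstring("<element />")

print(is_empty(paired), is_empty(self_closing))

# Serialising either one gives the same bytes.
print(ET.tostring(paired) == ET.tostring(self_closing))
```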
 An element's name, including its case, must match in the start and end tags.
 An attribute defines a property of the element.
 An element name can contain any alphanumeric characters. The only punctuation marks
allowed in names are the hyphen (-), underscore (_) and period (.). A name must not start
with a digit, hyphen or period.
 Names are case sensitive. For example, Address, address, and ADDRESS are different
names.
 The names in the start and end tags of an element must be identical.
 Attributes are part of the XML elements. An element can have multiple unique
attributes.
 Attributes give more information about XML elements. To be more precise, they define
properties of elements.
 An XML attribute is always a name-value pair.
 Attributes are used to add a unique label to an element, place the label in a category, add
a Boolean flag, or otherwise associate it with some string of data.
 Attributes are used to distinguish among elements of the same name when you do not
want to create a new element for every situation. Hence, an attribute can add a
little more detail in differentiating two or more similar elements.
 An attribute name must not appear more than once in the same start-tag or empty-
element tag.
 For a document to be valid against a DTD, each attribute must be declared in the
Document Type Definition (DTD) using an Attribute List Declaration.
 Attribute values must not contain direct or indirect entity references to external entities.
 The replacement text of any entity referred to directly or indirectly in an attribute value
must not contain a less-than sign (<).
 Most browsers will display an XML document with color-coded elements.
 Often a plus (+) or minus sign (-) to the left of the elements can be clicked to expand or
collapse the element structure.
 XSL (eXtensible Stylesheet Language) is a style sheet language for XML.
 XSLT is the most important part of XSL
 XSLT is used to transform an XML document into another XML document, or another
type of document that is recognized by a browser, like HTML and XHTML.
 Normally XSLT does this by transforming each XML element into an (X)HTML
element.
 With XSLT you can add/remove elements and attributes to or from the output file. You
can also rearrange and sort elements, perform tests and make decisions about which
elements to hide and display, and a lot more.
 A common way to describe the transformation process is to say that XSLT transforms
an XML source-tree into an XML result-tree.
 Example XML document:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="example.xsl"?>
<Article>
<Title>My Article</Title>
<Authors>
<Author>Mr. Foo</Author>
<Author>Mr. Bar</Author>
</Authors>
<Body>This is my article text.</Body>
</Article>
XSL Stylesheet (example.xsl):
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
Article - <xsl:value-of select="/Article/Title"/>
Authors: <xsl:apply-templates select="/Article/Authors/Author"/>
</xsl:template>
<xsl:template match="Author">
- <xsl:value-of select="." />
</xsl:template>
</xsl:stylesheet>

Browser Output:
Article - My Article
Authors:
- Mr. Foo
- Mr. Bar
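Python's standard library has no XSLT processor, so the sketch below is not an XSLT run. It is a hand-written Python equivalent of the two templates above, meant only to show what the stylesheet does step by step; the function name render_article is an assumption.

```python
import xml.etree.ElementTree as ET

ARTICLE = """<Article>
<Title>My Article</Title>
<Authors>
<Author>Mr. Foo</Author>
<Author>Mr. Bar</Author>
</Authors>
<Body>This is my article text.</Body>
</Article>"""


def render_article(xml_text):
    """Mimic the stylesheet: title line, then one line per author."""
    root = ET.fromstring(xml_text)
    # Plays the role of <xsl:template match="/">.
    lines = [f"Article - {root.findtext('Title')}", "Authors:"]
    # Plays the role of <xsl:template match="Author">.
    for author in root.findall("Authors/Author"):
        lines.append(f"- {author.text}")
    return "\n".join(lines)


print(render_article(ARTICLE))
```

The source tree is read, selected nodes are visited in document order, and text is written to the result, which is the same flow an XSLT processor follows.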
 XPath is used to navigate through elements and attributes in an XML document.
 XPath is a syntax for defining parts of an XML document. It uses path expressions to
navigate in XML documents.
 XPath contains a library of standard functions.
 XPath is a major element in XSLT.
 XPath is a W3C recommendation
 XPath uses path expressions to select nodes or node-sets in an XML document. These
path expressions look very much like the expressions you see when you work with a
traditional computer file system.
 Without XPath knowledge you will not be able to create XSLT documents.
 XPath is also used in XQuery, XPointer and XLink
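To make path expressions concrete, here is a small sketch using the limited XPath subset that Python's xml.etree.ElementTree supports. The bookstore document is a hypothetical example, and a full XPath engine offers many more functions and axes than shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document used only to demonstrate path expressions.
BOOKSTORE = """<bookstore>
  <book category="children">
    <title>Example Story</title>
    <year>2005</year>
  </book>
  <book category="web">
    <title>Example XML Guide</title>
    <year>2003</year>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# Path steps look like file system paths: bookstore/book/title.
all_titles = [t.text for t in root.findall("./book/title")]

# A predicate in square brackets filters on an attribute value.
children_titles = [
    t.text for t in root.findall(".//book[@category='children']/title")
]

# A numeric predicate selects by position (1-based).
first_book_year = root.find("./book[1]/year").text

print(all_titles, children_titles, first_book_year)
```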
 The XML Linking Language, or XLink, is an XML markup language and W3C specification
that provides methods for creating internal and external links within XML documents, and
associating metadata with those links.
 XLink provides a framework for creating both basic unidirectional links and more
complex linking structures.
 XLink is short for the XML Linking Language
 XLink is a language for creating hyperlinks in XML documents
 XLink is similar to HTML links, but it is a lot more powerful
 XLink supports simple links (like the HTML link system) and extended links (for linking
more than one resource together)
 With XLink, the links can be defined outside of the linked files
 XLink is a W3C recommendation

XLink attributes:
xlink:actuate (values: onLoad, onRequest, other, none): Defines when the linked
resource is read and shown. onLoad: the resource should be loaded and shown when the
document loads. onRequest: the resource is not read or shown before the link is clicked.
xlink:href (value: URL): Specifies the URL to link to.
xlink:show (values: embed, new, replace, other, none): Specifies where to open the
link. Default is "replace".
xlink:type (values: simple, extended, locator, arc, resource, title, none): Specifies
the type of link.

 XPointer is a system for addressing components of XML-based internet media.
 The XPointer language is divided among four specifications:
a "framework" which forms the basis for identifying XML fragments,
a positional element addressing scheme,
a scheme for namespaces,
a scheme for XPath-based addressing.
 There is no browser support for XPointer, but XPointer is used in other XML
languages.
 XPointer is short for the XML Pointer Language
 XPointer uses XPath expressions to navigate in the XML document.
 XPointer is a W3C recommendation

 XML validation is the process of checking a document written in XML (eXtensible
Markup Language) to confirm that it is both well-formed and also "valid" in that it
follows a defined structure.
 An XML document is said to be valid if its contents match with the elements, attributes
and associated document type declaration (DTD), and if the document complies with the
constraints expressed in it.
 An XML parser deals with validation at two levels:
Well-formed XML document
Valid XML document
 An XML document is said to be well-formed if it adheres to the following rules:
XML files without a DTD may use only the predefined character entities: amp (&),
apos (single quote), gt (>), lt (<) and quot (double quote). Any other entity must be
declared.
Tags must be properly nested, i.e. the inner tag must be closed before the outer tag
is closed.
Each opening tag must have a closing tag, or it must be a self-ending
tag (<title>....</title> or <title/>).
An attribute name may appear only once in a start tag, and its value must be quoted.
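The nesting and entity rules can be tried out directly. This is a minimal sketch with Python's xml.etree.ElementTree, which checks well-formedness only; the helper name parse_text and the sample strings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def parse_text(xml_text):
    """Return the root element's text, or None if the input is not well-formed."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None


# The five predefined entities work without any DTD.
predefined = parse_text(
    "<p>AT&amp;T &lt;tag&gt; &quot;q&quot; &apos;a&apos;</p>"
)
print(predefined)

# Each of these breaks one rule from the list above.
wrong_nesting = parse_text("<b><i>text</b></i>")   # inner tag closed too late
undeclared_entity = parse_text("<p>&copy;</p>")    # copy is not predefined
bare_ampersand = parse_text("<p>a & b</p>")        # & must be written &amp;
print(wrong_nesting, undeclared_entity, bare_ampersand)
```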
 If an XML document is well-formed and also conforms to the rules of its associated
Document Type Definition (DTD), then it is said to be a valid XML document.
 The XML Document Type Definition, commonly known as DTD, is a way to
describe an XML language precisely.
 DTDs check the vocabulary and the validity of the structure of XML documents against
the grammatical rules of the appropriate XML language.
 An XML DTD can be either specified inside the document, or it can be kept in a separate
document and then linked separately.
 The document type declaration starts with the <!DOCTYPE delimiter.
 The element name that follows <!DOCTYPE tells the parser to parse the document from
the specified root element.
 The DTD identifier is an identifier for the document type definition, which may be the
path to a file on the system or the URL of a file on the internet. If the DTD points to an
external path, it is called the External Subset.
 The square brackets [ ] enclose an optional list of markup declarations called the
Internal Subset.
 Elements are the main building blocks of both XML and HTML documents.
 Examples of empty HTML elements are "hr", "br" and "img".
 Attributes provide extra information about elements.
 Attributes are always placed inside the opening tag of an element.
 HTML entity: "&nbsp;". This "no-breaking-space" entity is used in HTML to insert an
extra space in a document.
 Entities are expanded when a document is parsed by an XML parser
 PCDATA means parsed character data.
 PCDATA is text that WILL be parsed by a parser. The text will be examined by the
parser for entities and markup. Tags inside the text will be treated as markup and entities
will be expanded.
 Parsed character data should not contain any &, <, or > characters; these need to be
represented by the &amp; &lt; and &gt; entities, respectively.
 CDATA means character data.
 CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be
treated as markup and entities will not be expanded
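The difference is easiest to see by parsing the same characters twice, once as ordinary parsed text and once inside a CDATA section. A minimal sketch with Python's xml.etree.ElementTree follows; the <code> element and its content are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the entity is expanded and <b> becomes a child element.
parsed = ET.fromstring("<code>if a &lt; b <b>bold</b></code>")
parsed_text = parsed.text
parsed_children = [child.tag for child in parsed]

# CDATA section: everything between <![CDATA[ and ]]> is kept as plain text.
unparsed = ET.fromstring(
    "<code><![CDATA[if a < b <b>bold</b> &lt;]]></code>"
)
unparsed_text = unparsed.text
unparsed_children = [child.tag for child in unparsed]

print(repr(parsed_text), parsed_children)
print(repr(unparsed_text), unparsed_children)
```

In the first case the parser produces a child element and an expanded "<"; in the second the markup and the entity reference survive as literal characters.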
 In a DTD, elements are declared with an ELEMENT declaration.
 Empty elements are declared with the category keyword EMPTY
 Elements with only parsed character data are declared with #PCDATA inside parentheses
 Elements declared with the category keyword ANY, can contain any combination of
parsable data
 Elements with one or more children are declared with the name of the children elements
inside parentheses
 In a declaration such as <!ELEMENT note (message+)>, the + sign declares that the
child element "message" must occur one or more times inside the "note" element.
 In a DTD, attributes are declared with an ATTLIST declaration
 Example attribute declaration in a DTD: <!ATTLIST payment type CDATA "check">
 An attribute type can be one of the following:
 CDATA: The value is character data
 (en1|en2|..): The value must be one from an enumerated list
 ID: The value is a unique id
 IDREF: The value is the id of another element
 IDREFS: The value is a list of other ids
 NMTOKEN: The value is a valid XML name token
 NMTOKENS: The value is a list of valid XML name tokens
 ENTITY: The value is an entity
 ENTITIES: The value is a list of entities
 NOTATION: The value is a name of a notation
 xml: The value is a predefined xml value
 Enumerated attribute values are used when you want the attribute value to be one of a
fixed set of legal values
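A declared default value is supplied by the parser when the attribute is left out. The sketch below uses Python's low-level xml.parsers.expat module, which by default reports attributes derived from ATTLIST declarations in the internal subset; the helper name attributes_seen is an assumption.

```python
import xml.parsers.expat


def attributes_seen(xml_text):
    """Map each element name to the attributes the parser reports for it."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return seen


# Inline DTD declaring "type" on "payment" with the default value "check".
DTD = '<!DOCTYPE payment [<!ATTLIST payment type CDATA "check">]>'

omitted = attributes_seen(DTD + "<payment/>")
explicit = attributes_seen(DTD + '<payment type="cash"/>')
print(omitted, explicit)
```

When the attribute is written explicitly, the document's own value wins over the default.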
 Data can be stored in child elements or in attributes.
 Use child elements if the information feels like data
 I like to store data in child elements.
<note>
<date>
<day>12</day>
<month>11</month>
<year>2002</year>
</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Some of the problems with attributes are:


attributes cannot contain multiple values (child elements can)
attributes are not easily expandable (for future changes)
attributes cannot describe structures (child elements can)
attributes are more difficult to manipulate by program code
attribute values are not easy to test against a DTD

 Use attributes only to provide information that is not relevant to the data
 Entities are used to define shortcuts to special characters.
 Entities can be declared internal or external
 An entity reference has three parts: an ampersand (&), an entity name, and a semicolon (;).
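As a small illustration, the sketch below declares an internal entity in an inline DTD and lets the parser expand it. The entity name writer and its value are hypothetical; external entities are not covered here.

```python
import xml.etree.ElementTree as ET

# "writer" is declared in the internal subset; lt, amp and gt are predefined.
DOC = """<!DOCTYPE note [
<!ENTITY writer "Jani">
]>
<note><author>&writer;</author><symbols>&lt;&amp;&gt;</symbols></note>"""

root = ET.fromstring(DOC)
author = root.findtext("author")     # entity replaced by its declared text
symbols = root.findtext("symbols")   # predefined entities become characters
print(author, symbols)
```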
 An XML Schema describes the structure of an XML document.
 The purpose of an XML Schema is to define the legal building blocks of an XML
document:
the elements and attributes that can appear in a document
the number of (and order of) child elements
data types for elements and attributes
default and fixed values for elements and attributes
 The <schema> element is the root element of every XML Schema
 The <schema> element may contain some attributes.
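An XML Schema is itself an XML document, so it can be read like any other. The sketch below assumes a small hypothetical schema for a "note" element and lists what it declares. Python's standard library does not validate against XML Schema, so this only inspects the schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
NS = {"xs": XS}

# Hypothetical schema: child elements in order, data types, one default value.
NOTE_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="priority" type="xs:integer" default="1"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(NOTE_SCHEMA)

# The root element of the schema document is <schema> in the XS namespace.
root_is_schema = schema.tag == f"{{{XS}}}schema"

# Child elements of "note" with their declared data types, in order.
child_path = "xs:element/xs:complexType/xs:sequence/xs:element"
children = [(e.get("name"), e.get("type")) for e in schema.findall(child_path, NS)]

# Attribute declarations with their default values.
defaults = {a.get("name"): a.get("default") for a in schema.iter(f"{{{XS}}}attribute")}
print(root_is_schema, children, defaults)
```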
 XML can easily be stored and generated by a standard web server.
 XML files can be stored on an Internet server exactly the same way as HTML files
 XML can be generated on a server without any installed XML software.
 The content type of the response header must be set to "text/xml".
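The notes refer to ASP for this step. As a stand-in, and not the original ASP code, here is a minimal Python WSGI sketch that generates XML on the server and sets the content type to "text/xml". The function names and the note content are assumptions.

```python
import xml.etree.ElementTree as ET


def build_note_xml():
    """Build a small XML document in memory and return it as UTF-8 bytes."""
    note = ET.Element("note")
    for tag, text in [
        ("to", "Tove"),
        ("from", "Jani"),
        ("heading", "Reminder"),
        ("body", "Don't forget me this weekend!"),
    ]:
        ET.SubElement(note, tag).text = text
    return ET.tostring(note, encoding="utf-8", xml_declaration=True)


def application(environ, start_response):
    """WSGI entry point: every request gets the XML document back."""
    body = build_note_xml()
    headers = [
        ("Content-Type", "text/xml; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the application callable can be passed to wsgiref.simple_server.make_server from the standard library; no separate XML software is involved.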
 To generate an XML response from the server - simply write the following code and
save it as an ASP file on the web server
 XML can be generated from a database without any installed XML software.
 To generate an XML database response from the server, simply write the following
code and save it as an ASP file on the web server
