XML
2
XML is not…
• A replacement for HTML
• A presentation format
• A programming language
• A network transfer protocol
• A database
3
But then – what is it?
XML is a markup language for text documents / textual data.
XML lets you define languages ("applications") to represent
text documents / textual data.
4
XML by Example
<article>
<author>Gerhard Weikum</author>
<title>The Web in 10 Years</title>
</article>
• Easy to understand for human users
• Very expressive (semantics along with the data)
• Well structured, easy to read and write from programs
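A minimal sketch of "easy to read from programs", assuming Python and its standard xml.etree.ElementTree module (neither is part of the original slides):

```python
import xml.etree.ElementTree as ET

# The article from the slide, as a string.
doc = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in 10 Years</title>
</article>"""

root = ET.fromstring(doc)           # parse the text into an element tree
author = root.findtext("author")    # text of the first <author> child
title = root.findtext("title")      # text of the first <title> child
print(f"{author}: {title}")
```

The program only needs the element names; it never depends on line breaks or indentation.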
5
XML by Example
<t108>
<x87>Gerhard Weikum</x87>
<g10>The Web in 10 Years</g10>
</t108>
• Hard to understand for human users
• Not expressive (no semantics along with the data)
• Well structured, easy to read and write from programs
6
Possible Advantages of Using XML
• Truly Portable Data
• Easily readable by human users
• Very expressive (semantics near data)
• Very flexible and customizable (no finite tag set)
• Easy to use from programs
• Easy to convert into other representations
(XML transformation languages)
• Many additional standards and tools
• Widely used and supported
7
App. Scenario 1: Content Mgt.
[Figure: a database with XML documents feeds converters (XML2HTML, XML2WML, XML2PDF) that serve the clients.]
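What an XML2HTML converter might do can be sketched as follows. This is an illustration only: the function name article_to_html and the chosen HTML layout are assumptions, and real converters are often written in an XML transformation language rather than by hand.

```python
import xml.etree.ElementTree as ET
from html import escape

def article_to_html(xml_text: str) -> str:
    """Render an <article> with <title> and <author> children as HTML."""
    root = ET.fromstring(xml_text)
    # Escape the text so characters like '<' or '&' stay valid in HTML.
    title = escape(root.findtext("title", default=""))
    author = escape(root.findtext("author", default=""))
    return f"<h1>{title}</h1>\n<p>by {author}</p>"

html_out = article_to_html(
    "<article><author>Gerhard Weikum</author>"
    "<title>The Web in 10 Years</title></article>"
)
print(html_out)
```

The same XML source could be passed to other converters (WML, PDF) without changing the stored document.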
8
App. Scenario 2: Data Exchange
[Figure: a buyer and a supplier, each running a legacy system (e.g., SAP R/2 or Cobol) behind an XML adapter, exchange an order as XML (BMECat, ebXML, RosettaNet, BizTalk, …).]
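An XML adapter essentially maps a legacy record to an agreed XML vocabulary. A rough sketch, assuming a flat Python dictionary as the legacy record; the element names (order, buyer, supplier, item) and the sample values are hypothetical and not taken from BMECat, ebXML, or any other real standard:

```python
import xml.etree.ElementTree as ET

def order_to_xml(record: dict) -> str:
    """Turn a flat legacy record into an XML order (hypothetical vocabulary)."""
    order = ET.Element("order", id=record["id"])
    ET.SubElement(order, "buyer").text = record["buyer"]
    ET.SubElement(order, "supplier").text = record["supplier"]
    # Attribute values must be strings, so the quantity is converted.
    item = ET.SubElement(order, "item", quantity=str(record["quantity"]))
    item.text = record["article"]
    return ET.tostring(order, encoding="unicode")

# Hypothetical sample record.
xml_order = order_to_xml(
    {"id": "4711", "buyer": "ACME", "supplier": "Widgets Inc.",
     "quantity": 3, "article": "Gear"}
)
print(xml_order)
```

The receiving side runs the reverse mapping, so neither legacy system needs to know the other's internal format.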
9
App. Scenario 3: XML for Metadata
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://www-dbs/Sch03.pdf">
<dc:title>A Framework for…</dc:title>
<dc:creator>Ralf Schenkel</dc:creator>
<dc:description>While there are...</dc:description>
<dc:publisher>Saarland University</dc:publisher>
<dc:subject>XML Indexing</dc:subject>
<dc:rights>Copyright ...</dc:rights>
<dc:type>Electronic Document</dc:type>
<dc:format>application/pdf</dc:format>
<dc:language>en</dc:language>
</rdf:Description>
</rdf:RDF>
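Reading such metadata from a program could look like the sketch below. It assumes Python's xml.etree.ElementTree and a shortened copy of the record; the two namespace URIs are the standard RDF and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map for the RDF and Dublin Core namespaces.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www-dbs/Sch03.pdf">
    <dc:creator>Ralf Schenkel</dc:creator>
    <dc:subject>XML Indexing</dc:subject>
    <dc:language>en</dc:language>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(doc)
desc = root.find("rdf:Description", NS)
# Namespaced attributes are keyed as "{namespace-uri}localname".
about = desc.get(f"{{{NS['rdf']}}}about")
# Collect every dc:* child into a plain dictionary of metadata fields.
dc_prefix = f"{{{NS['dc']}}}"
metadata = {el.tag[len(dc_prefix):]: el.text
            for el in desc if el.tag.startswith(dc_prefix)}
print(about, metadata)
```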
10
App. Scenario 4: Document Markup
<article>
<section id="1" title="Intro">
This article is about <index>XML</index>.
</section>
<section id="2" title="Main Results">
<name>Weikum</name> <cite idref="Weik01"/> shows
the following theorem (see Section <ref idref="1"/>)
<theorem id="theo:1" source="Weik01">
For any XML document x, ...
</theorem>
</section>
<literature>
<cite id="Weik01"><author>Weikum</author></cite>
</literature>
</article>
11
App. Scenario 4: Document Markup
• Document Markup adds structural and semantic
information to documents, e.g.
– Sections, Subsections, Theorems, …
– Cross References
– Literature Citations
– Index Entries
– Named Entities
12
XML
Part 2 – Basic XML Concepts
13
XML Documents
What's in an XML document?
• Elements
• Attributes
• plus some other details
14
A Simple XML Document
<article>
<author>Gerhard Weikum</author>
<title>The Web in Ten Years</title>
<text>
<abstract>In order to evolve...</abstract>
<section number="1" title="Introduction">
The <index>Web</index> provides the universal...
</section>
</text>
</article>
15
A Simple XML Document (annotated)
In the document above:
• The tags (article, author, title, …) are freely definable.
• An element consists of a start tag (e.g. <section …>), its content
(subelements and/or text), and an end tag (e.g. </section>).
• Attributes such as number="1" have a name and a value.
18
Elements in XML Documents
• (Freely definable) tags: article, title, author
– with start tag: <article> etc.
– and end tag: </article> etc.
• Elements: <article> ... </article>
• Elements have a name (article) and a content (...)
• Elements may be nested.
• Elements may be empty: <this_is_empty/>
• Element content consists of strings with special characters and/or
nested elements (mixed content if both).
• Each XML document has exactly one root element and forms a tree.
• Elements with a common parent are ordered.
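Nesting, order, and mixed content can be seen directly in a parsed tree. A small sketch, assuming Python's xml.etree.ElementTree (its .text/.tail convention is specific to that library, not to XML itself); the helper depth is an illustrative name:

```python
import xml.etree.ElementTree as ET

doc = ('<section number="1" title="Introduction">'
       'The <index>Web</index> provides the universal...'
       '</section>')
section = ET.fromstring(doc)

# ElementTree stores mixed content as: text before the first child in
# .text, and text following each child in that child's .tail.
index = section[0]                  # children are ordered; [0] is the first
parts = [section.text, index.text, index.tail]
print(parts)

def depth(elem):
    """Height of the subtree rooted at elem (a lone element has depth 1)."""
    return 1 + max((depth(child) for child in elem), default=0)

print(depth(section))
```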
19
Elements vs. Attributes
Elements may have attributes (in the start tag) that have a name
and a value, e.g. <section number="1">.
What is the difference between elements and attributes?
• Only one attribute with a given name per element
• Attributes have no structure, simply strings (while elements can
have subelements)
As a rule of thumb:
• Content into elements
• Metadata into attributes
Example:
<person born="1912-06-23" died="1954-06-07">
Alan Turing</person> proved that…
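The rule of thumb shows up in how a program accesses the two. A sketch assuming Python's xml.etree.ElementTree and datetime modules: content is read from the element, metadata from its attributes, and any structure inside an attribute value (here a date) has to be interpreted by the application.

```python
import xml.etree.ElementTree as ET
from datetime import date

person = ET.fromstring(
    '<person born="1912-06-23" died="1954-06-07">Alan Turing</person>'
)

name = person.text                               # content lives in the element
born = date.fromisoformat(person.get("born"))    # metadata lives in attributes
died = date.fromisoformat(person.get("died"))
# Attribute values are plain strings; the date arithmetic is up to the program.
age = died.year - born.year - ((died.month, died.day) < (born.month, born.day))
print(name, age)
```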
20
XML Documents as Ordered Trees
article
  author
    "Gerhard Weikum"
  title
    "The Web in 10 years"
  text
    abstract
      "In order …"
    section (number="1", title="…")
      "The"
      index
        "Web"
      "provides …"
22
Well-Formed XML Documents
A well-formed document must adhere to, among others, the
following rules:
• Every start tag has a matching end tag.
• Elements may nest, but must not overlap.
• There must be exactly one root element.
• Attribute values must be quoted.
• An element may not have two attributes with the same
name.
• Comments and processing instructions may not appear
inside tags.
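A parser rejects documents that break these rules. The sketch below, assuming Python's xml.etree.ElementTree, feeds it one small hand-made sample per rule; the helper name is_well_formed and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "ok":              '<a><b x="1"/></a>',
    "missing end tag": '<a><b></a>',
    "overlapping":     '<a><b><c></b></c></a>',
    "two roots":       '<a/><b/>',
    "unquoted value":  '<a x=1/>',
    "duplicate attr":  '<a x="1" x="2"/>',
}
results = {name: is_well_formed(xml) for name, xml in samples.items()}
print(results)
```

Only the first sample is accepted; every other one violates exactly one rule from the list.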
24
Namespace
<dbs:book xmlns:dbs="http://www-dbs/dbs">
• xmlns: signals that a namespace definition happens
• dbs: prefix as abbreviation of the URI
• http://www-dbs/dbs: unique URI to identify the namespace
25
Namespace Example
<dbs:book xmlns:dbs="http://www-dbs/dbs">
<dbs:description> ... </dbs:description>
<dbs:text>
<dbs:formula>
<mathml:math
xmlns:mathml="http://www.w3.org/1998/Math/MathML">
...
</mathml:math>
</dbs:formula>
</dbs:text>
</dbs:book>
26
Default Namespace
• A default namespace may be set for an element and its
content (but not its attributes):
<book xmlns="http://www-dbs/dbs">
<description>...</description>
</book>
• It can be overridden in nested elements by specifying the
namespace there (using a prefix or a new default namespace)
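That a prefix and a default namespace yield the same element names, and that attributes are not affected by the default namespace, can be checked with a short sketch. It assumes Python's xml.etree.ElementTree; the lang attribute is added here purely for illustration.

```python
import xml.etree.ElementTree as ET

DBS = "http://www-dbs/dbs"

prefixed = ET.fromstring(
    f'<dbs:book xmlns:dbs="{DBS}"><dbs:description>...</dbs:description></dbs:book>'
)
defaulted = ET.fromstring(
    f'<book xmlns="{DBS}" lang="en"><description>...</description></book>'
)

# ElementTree expands names to "{namespace-uri}localname"; the prefix is gone.
same_name = prefixed.tag == defaulted.tag == f"{{{DBS}}}book"
same_child = prefixed[0].tag == defaulted[0].tag
# The default namespace does not apply to attributes: "lang" stays unqualified.
attr_names = list(defaulted.attrib)
print(same_name, same_child, attr_names)
```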
27
XML
Part 3 – Defining XML Data Formats
28
3.1 Document Type Definitions
Sometimes XML is too flexible:
• For exchanging data, the format (i.e., elements,
attributes and their semantics) must be fixed
⇒ Document Type Definitions (DTD) for establishing the
vocabulary for one XML application (in some sense
comparable to schemas in databases)
A document is valid with respect to a DTD if it conforms
to the rules specified in that DTD.
Most XML parsers can be configured to validate.
29
DTD Example: Elements
<!ELEMENT article (title,author+,text)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT text (abstract,section*,literature?)>
<!ELEMENT abstract (#PCDATA)>
<!ELEMENT section (#PCDATA|index)+>
<!ELEMENT literature (#PCDATA)>
<!ELEMENT index (#PCDATA)>
• Content of the title element is parsed character data.
• Content of the article element is a title element,
followed by one or more author elements,
followed by a text element.
• Content of the text element may contain zero or more
section elements in this position.
30
Element Declarations in DTDs
One element declaration for each element type:
<!ELEMENT element_name content_specification>
where content_specification can be
• (#PCDATA) parsed character data
• (child) one child element
• (c1,…,cn) a sequence of child elements c1…cn
• (c1|…|cn) one of the elements c1…cn
For each component c, possible counts can be specified:
– c exactly one such element
– c+ one or more
– c* zero or more
– c? zero or one
Plus arbitrary combinations using parentheses:
<!ELEMENT f ((a|b)*,c+,(d|e))*>
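Content specifications behave like regular expressions over the sequence of child element names. The sketch below makes that concrete under stated assumptions: it uses Python's re and xml.etree.ElementTree, handles only element-only content models (no #PCDATA, EMPTY, or ANY), and the function names are illustrative. It is not how a validating parser is implemented.

```python
import re
import xml.etree.ElementTree as ET

def model_to_regex(model: str) -> re.Pattern:
    """Compile an element-only DTD content model into a regex over child names.

    Each name becomes a group matching "name "; ',' (sequence) becomes plain
    concatenation; '|', '*', '+', '?' and parentheses keep their meaning.
    """
    model = re.sub(r"\s+", "", model)   # ignore whitespace in the model
    pattern = re.sub(r"[\w.-]+",
                     lambda m: f"(?:{re.escape(m.group())} )", model)
    return re.compile(pattern.replace(",", ""))

def children_match(elem: ET.Element, model: str) -> bool:
    """Check the sequence of child element names of elem against model."""
    names = "".join(child.tag + " " for child in elem)
    return model_to_regex(model).fullmatch(names) is not None

article = ET.fromstring("<article><title/><author/><author/><text/></article>")
print(children_match(article, "(title,author+,text)"))
```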
31
More on Element Declarations
• Elements with mixed content:
<!ELEMENT text (#PCDATA|index|cite|glossary)*>
• Elements with empty content:
<!ELEMENT image EMPTY>
• Elements with arbitrary content (not suitable for
production-level DTDs):
<!ELEMENT thesis ANY>
32
Attribute Declarations in DTDs
Attributes are declared per element:
<!ATTLIST section number CDATA #REQUIRED
title CDATA #REQUIRED>
declares two required attributes for element section.
Here section is the element name, number and title are attribute
names, CDATA is the attribute type, and #REQUIRED is the attribute default.
Possible attribute defaults:
• #REQUIRED: the attribute is required in each element instance
• #IMPLIED: the attribute is optional
• #FIXED "value": the attribute always has this value
• "value": the attribute has this default value if it is
omitted from the element instance
34
Attribute Types in DTDs
• CDATA string data
• (A1|…|An) enumeration of all possible values of the
attribute (each is XML name)
• ID unique XML name to identify the element
• IDREF refers to ID attribute of some other element
("intra-document link")
• IDREFS list of IDREF, separated by white space
• plus some more
35
Attribute Examples
<!ATTLIST publication type (journal|inproceedings) #REQUIRED
pubid ID #REQUIRED>
<!ATTLIST cite cid IDREF #REQUIRED>
<!ATTLIST citation ref IDREF #IMPLIED
cid ID #REQUIRED>
<publications>
<publication type="journal" pubid="Weikum01">
<author>Gerhard Weikum</author>
<text>In the Web of 2010, XML <cite cid="c12"/>...</text>
<citation cid="c12" ref="XML98"/>
<citation cid="c15">...</citation>
</publication>
<publication type="inproceedings" pubid="XML98">
<text>XML, the Extensible Markup Language, ...</text>
</publication>
</publications>
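What a validating parser checks for ID and IDREF attributes can be approximated by hand. A sketch assuming Python's xml.etree.ElementTree, which does not read the DTD, so the attribute typing from the ATTLIST declarations is restated in two dictionaries; the dangling reference Missing99 is a made-up value to trigger the check.

```python
import xml.etree.ElementTree as ET

# Attribute typing restated by hand (an assumption of this sketch).
ID_ATTRS = {"publication": ["pubid"], "citation": ["cid"]}
IDREF_ATTRS = {"cite": ["cid"], "citation": ["ref"]}

def check_links(root):
    """Return (duplicate IDs, IDREF values that match no ID)."""
    ids, duplicates, refs = set(), [], []
    for elem in root.iter():
        for attr in ID_ATTRS.get(elem.tag, []):
            value = elem.get(attr)
            if value is None:
                continue
            if value in ids:
                duplicates.append(value)   # IDs must be unique per document
            ids.add(value)
        for attr in IDREF_ATTRS.get(elem.tag, []):
            value = elem.get(attr)
            if value is not None:
                refs.append(value)
    # Resolve references only after all IDs are known (forward references).
    dangling = [r for r in refs if r not in ids]
    return duplicates, dangling

doc = """<publications>
  <publication type="journal" pubid="Weikum01">
    <text>... <cite cid="c12"/> ...</text>
    <citation cid="c12" ref="XML98"/>
    <citation cid="c15" ref="Missing99"/>
  </publication>
  <publication type="inproceedings" pubid="XML98"/>
</publications>"""
duplicates, dangling = check_links(ET.fromstring(doc))
print(duplicates, dangling)
```

Note that the check cannot tell whether a reference points to the intended kind of element: any ID in the document satisfies any IDREF.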
37
Linking DTD and XML Docs
• Document Type Declaration in the XML document:
<!DOCTYPE article SYSTEM "http://www-dbs/article.dtd">
Here <!DOCTYPE and SYSTEM are keywords, article is the root element,
and the quoted string is the URI for the DTD.
38
Linking DTD and XML Docs
• Internal DTD:
<?xml version="1.0"?>
<!DOCTYPE article [
<!ELEMENT article (title,author+,text)>
...
<!ELEMENT index (#PCDATA)>
]>
<article>
...
</article>
• Both ways can be mixed; entity declarations in the internal DTD
override those in the external DTD:
<!DOCTYPE article SYSTEM "article.dtd" [
<!ENTITY % pub_content "(title+,author*,text)">
]>
39
Flaws of DTDs
• No support for basic data types like integers, doubles,
dates, times, …
• No structured, self-definable data types
• No type derivation
• id/idref links are quite loose (target is not specified)
40
XML
Part 4 – Querying XML Data
4.1 XPath
4.2 XQuery
41
Querying XML with XPath and XQuery
XPath and XQuery are query languages for XML data, both
standardized by the W3C and supported by various database products.
Their search capabilities include
• logical conditions over element and attribute content
• regular expressions for pattern matching of element names
along paths or subtrees within XML data
+ joins, grouping, aggregation, transformation, etc. (XQuery only)
In contrast to database query languages like SQL, an XML query
does not necessarily need to know a fixed structural schema
for the underlying data.
A query result is a set of qualifying nodes, paths, subtrees,
or subgraphs from the underlying data graph,
or a set of XML documents constructed from this raw result.
42
4.1 XPath
• XPath is a simple language to identify parts of the XML
document (for further processing)
• XPath operates on the tree representation of the
document
• Result of an XPath expression is a set of elements or
attributes
46
XPath by Example
/literature/book/author retrieves all book authors:
starting with the root, it traverses the tree, matches the element
names literature, book, author, and returns the elements
<author>Suciu, Dan</author>,
<author>Abiteboul, Serge</author>, ...,
<author><firstname>Jeff</firstname>
<lastname>Ullman</lastname></author>
/literature/*/author: authors of books, articles, essays, etc.
/literature//author: authors that are descendants of literature
/literature//@year: value of the year attribute of descendants of literature
/literature//author[firstname]: authors that have a subelement firstname
/literature/(book|article)/author: authors of books or articles
/literature/book[price < "50"]: low priced books
/literature/book[author//country = "Germany"]: books with German author
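Some of these paths can be tried out with Python's xml.etree.ElementTree, which supports only a limited XPath subset. The sketch below is an approximation under that assumption: paths are written relative to the root element, the price comparison is done in Python because the library has no "<" predicate, and the sample document (years, prices) is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in the shape the slide assumes.
doc = """<literature>
  <book year="2000"><author>Suciu, Dan</author><price>45</price></book>
  <book year="1995"><author><firstname>Jeff</firstname><lastname>Ullman</lastname></author><price>80</price></book>
  <article year="2003"><author>Abiteboul, Serge</author></article>
</literature>"""
root = ET.fromstring(doc)   # root is the <literature> element

# Paths are relative to root, so /literature/book/author becomes book/author.
book_authors = root.findall("book/author")
all_authors = root.findall(".//author")                 # /literature//author
with_firstname = root.findall(".//author[firstname]")   # predicate on a child
years = [e.get("year") for e in root.findall(".//*[@year]")]  # //@year values
# No '<' predicate in ElementTree, so the price filter is done in Python.
cheap = [b for b in root.findall("book") if float(b.findtext("price")) < 50]
print(len(book_authors), len(all_authors), len(with_firstname), years, len(cheap))
```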
47
4.2 Core Concepts of XQuery
XQuery is an extremely powerful query language for XML data.
A query has the form of a so-called FLWR expression
(actual XQuery syntax writes the keywords in lowercase):
FOR $var1 IN expr1, $var2 IN expr2, ...
LET $var3 := expr3, $var4 := expr4, ...
WHERE condition
RETURN result-doc-construction
The FOR clause evaluates expressions (which may be XPath-style
path expressions) and binds the resulting elements to variables.
For a given binding each variable denotes exactly one element.
The LET clause binds entire sequences of elements to variables.
The WHERE clause evaluates a logical condition with each of
the possible variable bindings and selects those bindings that
satisfy the condition.
The RETURN clause constructs, from each of the variable bindings,
an XML result tree. This may involve grouping and aggregation
and even complete subqueries.
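The roles of the four clauses can be mirrored in ordinary code. The following is a Python analogy using xml.etree.ElementTree, not XQuery itself; the sample document, the prices, and the result element names (cheap-books, item) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
doc = """<literature>
  <book><title>Data on the Web</title><price>45</price></book>
  <book><title>Database Systems</title><price>80</price></book>
</literature>"""
root = ET.fromstring(doc)

result = ET.Element("cheap-books")           # container for the result tree
for book in root.findall("book"):            # FOR $book IN /literature/book
    price = float(book.findtext("price"))    # LET $price := $book/price
    if price < 50:                           # WHERE $price < 50
        # RETURN <item>{ $book/title/text() }</item>
        ET.SubElement(result, "item").text = book.findtext("title")

result_xml = ET.tostring(result, encoding="unicode")
print(result_xml)
```

Each loop iteration corresponds to one variable binding; the WHERE condition discards bindings, and RETURN contributes one result fragment per surviving binding.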
49
Thanks
 
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software DevelopmentSecure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Shubham Joshi
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
Landscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature ReviewLandscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature Review
Hironori Washizaki
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
Andre Hora
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AIScaling GraphRAG:  Efficient Knowledge Retrieval for Enterprise AI
Scaling GraphRAG: Efficient Knowledge Retrieval for Enterprise AI
danshalev
 
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
How Valletta helped healthcare SaaS to transform QA and compliance to grow wi...
Egor Kaleynik
 
Exploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the FutureExploring Wayland: A Modern Display Server for the Future
Exploring Wayland: A Modern Display Server for the Future
ICS
 
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRYLEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
LEARN SEO AND INCREASE YOUR KNOWLDGE IN SOFTWARE INDUSTRY
NidaFarooq10
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
F-Secure Freedome VPN 2025 Crack Plus Activation  New VersionF-Secure Freedome VPN 2025 Crack Plus Activation  New Version
F-Secure Freedome VPN 2025 Crack Plus Activation New Version
saimabibi60507
 
Top 10 Client Portal Software Solutions for 2025.docx
Top 10 Client Portal Software Solutions for 2025.docxTop 10 Client Portal Software Solutions for 2025.docx
Top 10 Client Portal Software Solutions for 2025.docx
Portli
 
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...
Eric D. Schabell
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Andre Hora
 
Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025
kashifyounis067
 
Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]
saniaaftab72555
 
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software DevelopmentSecure Test Infrastructure: The Backbone of Trustworthy Software Development
Secure Test Infrastructure: The Backbone of Trustworthy Software Development
Shubham Joshi
 
EASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License CodeEASEUS Partition Master Crack + License Code
EASEUS Partition Master Crack + License Code
aneelaramzan63
 
Landscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature ReviewLandscape of Requirements Engineering for/by AI through Literature Review
Landscape of Requirements Engineering for/by AI through Literature Review
Hironori Washizaki
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 

XML for beginners

  • 1. XML
  • 2. 2 XML is not… • A replacement for HTML • A presentation format • A programming language • A network transfer protocol • A database
  • 3. 3 But then – what is it? XML is a markup language for text documents / textual data. XML allows one to define languages ("applications") to represent text documents / textual data.
  • 4. 4 XML by Example <article> <author>Gerhard Weikum</author> <title>The Web in 10 Years</title> </article> • Easy to understand for human users • Very expressive (semantics along with the data) • Well structured, easy to read and write from programs
  • 5. 5 XML by Example <t108> <x87>Gerhard Weikum</x87> <g10>The Web in 10 Years</g10> </t108> • Hard to understand for human users • Not expressive (no semantics along with the data) • Well structured, easy to read and write from programs
  • 6. 6 Possible Advantages of Using XML • Truly Portable Data • Easily readable by human users • Very expressive (semantics near data) • Very flexible and customizable (no finite tag set) • Easy to use from programs • Easy to convert into other representations (XML transformation languages) • Many additional standards and tools • Widely used and supported
  • 7. 7 App. Scenario 1: Content Mgt. Database with XML documents Clients Converters XML2HTML XML2WML XML2PDF
  • 8. 8 App. Scenario 2: Data Exchange Legacy System (e.g., SAP R/2) Legacy System (e.g., Cobol) XML Adapter XML Adapter XML (BMECat, ebXML, RosettaNet, BizTalk, …) Supplier Buyer Order
  • 9. 9 App. Scenario 3: XML for Metadata <rdf:RDF> <rdf:Description rdf:about="http://www-dbs/Sch03.pdf"> <dc:title>A Framework for…</dc:title> <dc:creator>Ralf Schenkel</dc:creator> <dc:description>While there are...</dc:description> <dc:publisher>Saarland University</dc:publisher> <dc:subject>XML Indexing</dc:subject> <dc:rights>Copyright ...</dc:rights> <dc:type>Electronic Document</dc:type> <dc:format>text/pdf</dc:format> <dc:language>en</dc:language> </rdf:Description> </rdf:RDF>
  • 10. 10 App. Scenario 4: Document Markup <article> <section id="1" title="Intro"> This article is about <index>XML</index>. </section> <section id="2" title="Main Results"> <name>Weikum</name> <cite idref="Weik01"/> shows the following theorem (see Section <ref idref="1"/>) <theorem id="theo:1" source="Weik01"> For any XML document x, ... </theorem> </section> <literature> <cite id="Weik01"><author>Weikum</author></cite> </literature> </article>
  • 11. 11 App. Scenario 4: Document Markup • Document Markup adds structural and semantic information to documents, e.g. – Sections, Subsections, Theorems, … – Cross References – Literature Citations – Index Entries – Named Entities
  • 12. 12 XML Part 2 – Basic XML Concepts
  • 13. 13 XML Documents What‘s in an XML document? • Elements • Attributes • plus some other details
  • 14. 14 A Simple XML Document <article> <author>Gerhard Weikum</author> <title>The Web in Ten Years</title> <text> <abstract>In order to evolve...</abstract> <section number="1" title="Introduction"> The <index>Web</index> provides the universal... </section> </text> </article>
  • 15. 15 A Simple XML Document <article> <author>Gerhard Weikum</author> <title>The Web in Ten Years</title> <text> <abstract>In order to evolve...</abstract> <section number="1" title="Introduction"> The <index>Web</index> provides the universal... </section> </text> </article> Callout: Freely definable tags
  • 16. 16 A Simple XML Document <article> <author>Gerhard Weikum</author> <title>The Web in Ten Years</title> <text> <abstract>In order to evolve...</abstract> <section number="1" title="Introduction"> The <index>Web</index> provides the universal... </section> </text> </article> Callouts: Element; Start Tag; End Tag; Content of the Element (Subelements and/or Text)
  • 17. 17 A Simple XML Document <article> <author>Gerhard Weikum</author> <title>The Web in Ten Years</title> <text> <abstract>In order to evolve...</abstract> <section number="1" title="Introduction"> The <index>Web</index> provides the universal... </section> </text> </article> Callout: Attributes with name and value
  • 18. 18 Elements in XML Documents • (Freely definable) tags: article, title, author – with start tag: <article> etc. – and end tag: </article> etc. • Elements: <article> ... </article> • Elements have a name (article) and a content (...) • Elements may be nested. • Elements may be empty: <this_is_empty/> • Element content are strings with special characters, and/or nested elements (mixed content if both). • Each XML document has exactly one root element and forms a tree. • Elements with a common parent are ordered.
  • 19. 19 Elements vs. Attributes Elements may have attributes (in the start tag) that have a name and a value, e.g. <section number="1">. What is the difference between elements and attributes? • Only one attribute with a given name per element • Attributes have no structure, simply strings (while elements can have subelements) As a rule of thumb: • Content into elements • Metadata into attributes Example: <person born="1912-06-23" died="1954-06-07"> Alan Turing</person> proved that…
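  • 20. 20 XML Documents as Ordered Trees [Tree diagram: root article with children author ("Gerhard Weikum"), title ("The Web in 10 years"), and text; text has children abstract ("In order …") and section (attributes number="1", title="…"); section contains the text "The", an index element ("Web"), and the text "provides …"]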
  • 21. 22 Well-Formed XML Documents A well-formed document must adhere to, among others, the following rules: • Every start tag has a matching end tag. • Elements may nest, but must not overlap. • There must be exactly one root element. • Attribute values must be quoted. • An element may not have two attributes with the same name. • Comments and processing instructions may not appear inside tags.
  • 22. 24 Namespace <dbs:book xmlns:dbs="http://www-dbs/dbs"> Callouts: xmlns signals that a namespace definition happens; dbs is the prefix, an abbreviation of the URI; "http://www-dbs/dbs" is the unique URI that identifies the namespace
  • 23. 25 Namespace Example <dbs:book xmlns:dbs="http://www-dbs/dbs"> <dbs:description> ... </dbs:description> <dbs:text> <dbs:formula> <mathml:math xmlns:mathml="http://www.w3.org/1998/Math/MathML"> ... </mathml:math> </dbs:formula> </dbs:text> </dbs:book>
  • 24. 26 Default Namespace • Default namespace may be set for an element and its content (but not its attributes): <book xmlns="http://www-dbs/dbs"> <description>...</description> </book> • Can be overridden in the elements by specifying the namespace there (using prefix or default namespace)
  • 25. 27 XML Part 3 – Defining XML Data Formats
  • 26. 28 3.1 Document Type Definitions Sometimes XML is too flexible: • For exchanging data, the format (i.e., elements, attributes and their semantics) must be fixed ⇒Document Type Definitions (DTD) for establishing the vocabulary for one XML application (in some sense comparable to schemas in databases) A document is valid with respect to a DTD if it conforms to the rules specified in that DTD. Most XML parsers can be configured to validate.
  • 27. 29 DTD Example: Elements <!ELEMENT article (title,author+,text)> <!ELEMENT title (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT text (abstract,section*,literature?)> <!ELEMENT abstract (#PCDATA)> <!ELEMENT section (#PCDATA|index)+> <!ELEMENT literature (#PCDATA)> <!ELEMENT index (#PCDATA)> Content of the title element is parsed character data Content of the article element is a title element, followed by one or more author elements, followed by a text element Content of the text element may contain zero or more section elements in this position
  • 28. 30 Element Declarations in DTDs One element declaration for each element type: <!ELEMENT element_name content_specification> where content_specification can be • (#PCDATA) parsed character data • (child) one child element • (c1,…,cn) a sequence of child elements c1…cn • (c1|…|cn) one of the elements c1…cn For each component c, possible counts can be specified: – c exactly one such element – c+ one or more – c* zero or more – c? zero or one Plus arbitrary combinations using parentheses: <!ELEMENT f ((a|b)*,c+,(d|e))*>
  • 29. 31 More on Element Declarations • Elements with mixed content: <!ELEMENT text (#PCDATA|index|cite|glossary)*> • Elements with empty content: <!ELEMENT image EMPTY> • Elements with arbitrary content (not suitable for production-level DTDs): <!ELEMENT thesis ANY>
  • 30. 32 Attribute Declarations in DTDs Attributes are declared per element: <!ATTLIST section number CDATA #REQUIRED title CDATA #REQUIRED> declares two required attributes for element section. element name attribute name attribute type attribute default
  • 31. 33 Attribute Declarations in DTDs Attributes are declared per element: <!ATTLIST section number CDATA #REQUIRED title CDATA #REQUIRED> declares two required attributes for element section. Possible attribute defaults: • #REQUIRED is required in each element instance • #IMPLIED is optional • #FIXED default always has this default value • default has this default value if the attribute is omitted from the element instance
  • 32. 34 Attribute Types in DTDs • CDATA string data • (A1|…|An) enumeration of all possible values of the attribute (each is XML name) • ID unique XML name to identify the element • IDREF refers to ID attribute of some other element ("intra-document link") • IDREFS list of IDREF, separated by white space • plus some more
  • 33. 35 Attribute Examples <!ATTLIST publication type (journal|inproceedings) #REQUIRED pubid ID #REQUIRED> <!ATTLIST cite cid IDREF #REQUIRED> <!ATTLIST citation ref IDREF #IMPLIED cid ID #REQUIRED> <publications> <publication type="journal" pubid="Weikum01"> <author>Gerhard Weikum</author> <text>In the Web of 2010, XML <cite cid="c12"/>...</text> <citation cid="c12" ref="XML98"/> <citation cid="c15">...</citation> </publication> <publication type="inproceedings" pubid="XML98"> <text>XML, the Extensible Markup Language, ...</text> </publication> </publications>
  • 34. 36 Attribute Examples <!ATTLIST publication type (journal|inproceedings) #REQUIRED pubid ID #REQUIRED> <!ATTLIST cite cid IDREF #REQUIRED> <!ATTLIST citation ref IDREF #IMPLIED cid ID #REQUIRED> <publications> <publication type="journal" pubid="Weikum01"> <author>Gerhard Weikum</author> <text>In the Web of 2010, XML <cite cid="c12"/>...</text> <citation cid="c12" ref="XML98"/> <citation cid="c15">...</citation> </publication> <publication type="inproceedings" pubid="XML98"> <text>XML, the Extensible Markup Language, ...</text> </publication> </publications>
  • 35. 37 Linking DTD and XML Docs • Document Type Declaration in the XML document: <!DOCTYPE article SYSTEM "http://www-dbs/article.dtd"> where !DOCTYPE and SYSTEM are keywords, article is the root element, and the quoted string is the URI for the DTD
  • 36. 38 Linking DTD and XML Docs • Internal DTD: <?xml version="1.0"?> <!DOCTYPE article [ <!ELEMENT article (title,author+,text)> ... <!ELEMENT index (#PCDATA)> ]> <article> ... </article> • Both ways can be mixed; the internal DTD overrides external entity information: <!DOCTYPE article SYSTEM "article.dtd" [ <!ENTITY % pub_content "(title+,author*,text)"> ]>
  • 37. 39 Flaws of DTDs • No support for basic data types like integers, doubles, dates, times, … • No structured, self-definable data types • No type derivation • id/idref links are quite loose (target is not specified)
  • 38. 40 XML Part 4 – Querying XML Data 4.1 XPath 4.2 XQuery
  • 39. 41 Querying XML with XPath and XQuery XPath and XQuery are query languages for XML data, both standardized by the W3C and supported by various database products. Their search capabilities include • logical conditions over element and attribute content • regular expressions for pattern matching of element names along paths or subtrees within XML data + joins, grouping, aggregation, transformation, etc. (XQuery only) In contrast to database query languages like SQL, an XML query does not necessarily (need to) know a fixed structural schema for the underlying data. A query result is a set of qualifying nodes, paths, subtrees, or subgraphs from the underlying data graph, or a set of XML documents constructed from this raw result.
  • 40. 42 4.1 XPath • XPath is a simple language to identify parts of the XML document (for further processing) • XPath operates on the tree representation of the document • Result of an XPath expression is a set of elements or attributes
  • 41. 46 XPath by Example /literature/book/author retrieves all book authors: starting with the root, traverses the tree, matches element names literature, book, author, and returns elements <author>Suciu, Dan</author>, <author>Abiteboul, Serge</author>, ..., <author><firstname>Jeff</firstname> <lastname>Ullman</lastname></author> /literature/*/author authors of books, articles, essays, etc. /literature//author authors that are descendants of literature /literature//@year value of the year attribute of descendants of literature /literature//author[firstname] authors that have a subelement firstname /literature/(book|article)/author authors of books or articles /literature/book[price < "50"] low priced books /literature/book[author//country = "Germany"] books with German author
  • 42. 47 4.2 Core Concepts of XQuery XQuery is an extremely powerful query language for XML data. A query has the form of a so-called FLWR expression (FLWOR in the final W3C standard, which adds an ORDER BY clause and writes the keywords in lowercase): FOR $var1 IN expr1, $var2 IN expr2, ... LET $var3 := expr3, $var4 := expr4, ... WHERE condition RETURN result-doc-construction The FOR clause evaluates expressions (which may be XPath-style path expressions) and binds the resulting elements to variables. For a given binding each variable denotes exactly one element. The LET clause binds entire sequences of elements to variables. The WHERE clause evaluates a logical condition with each of the possible variable bindings and selects those bindings that satisfy the condition. The RETURN clause constructs, from each of the variable bindings, an XML result tree. This may involve grouping and aggregation and even complete subqueries.
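
The path expressions from the "XPath by Example" slide can be tried out with a rough Python sketch. It uses the standard-library xml.etree.ElementTree module, which supports only a subset of XPath and evaluates paths relative to the root element, so /literature/book/author becomes book/author. The sample document, its years and its prices are invented for illustration; only the element names come from the slides. Attribute selection and the numeric price test are not covered by that subset and are done in plain Python here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element names are taken from the slides.
DOC = """
<literature>
  <book year="1999">
    <author>Suciu, Dan</author>
    <author>Abiteboul, Serge</author>
    <price>45</price>
  </book>
  <book year="2002">
    <author><firstname>Jeff</firstname><lastname>Ullman</lastname></author>
    <price>60</price>
  </book>
  <article year="2001">
    <author>Weikum, Gerhard</author>
  </article>
</literature>
"""

root = ET.fromstring(DOC)  # root is the <literature> element

# /literature/book/author : authors of books only
book_authors = root.findall("book/author")

# /literature/*/author : authors of books, articles, essays, etc.
any_authors = root.findall("*/author")

# /literature//author[firstname] : authors that have a firstname subelement
with_firstname = root.findall(".//author[firstname]")

# /literature//@year : ElementTree cannot return attribute nodes,
# so the attribute values are collected by walking the tree.
years = [e.get("year") for e in root.iter() if e.get("year") is not None]

# /literature/book[price < "50"] : the numeric comparison is done in Python.
cheap_books = [b for b in root.findall("book")
               if float(b.findtext("price")) < 50]

print(len(book_authors), len(any_authors), len(with_firstname))
print(years)
print([b.get("year") for b in cheap_books])
```

A full XPath engine would evaluate all of these expressions directly; the workaround above only reflects the limits of the standard-library module, not of XPath itself.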

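The well-formedness rules and the namespace behaviour described in Part 2 can also be checked with a small Python sketch. It relies on the standard-library parser, which rejects any document that is not well-formed; the helper name is_well_formed and the namespace URI http://example.org/dbs are illustrative assumptions, not part of the slides. Note that this checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


GOOD = ("<article><author>Gerhard Weikum</author>"
        "<title>The Web in 10 Years</title></article>")
OVERLAP = "<a><b></a></b>"                        # elements overlap
TWO_ROOTS = "<a/><b/>"                            # more than one root element
UNQUOTED = "<section number=1/>"                  # attribute value not quoted
DUP_ATTR = '<section number="1" number="2"/>'     # attribute name repeated

# Hypothetical namespace URI, used only for this example.
NS = "http://example.org/dbs"

# Prefixed elements: the parser expands the prefix to the full URI.
prefixed = ET.fromstring(
    f'<dbs:book xmlns:dbs="{NS}"><dbs:description>x</dbs:description></dbs:book>'
)
description = prefixed.find("dbs:description", {"dbs": NS})

# Default namespace: applies to the element and its content,
# but not to its unprefixed attributes.
defaulted = ET.fromstring(f'<book xmlns="{NS}" id="b1"><description/></book>')

print(is_well_formed(GOOD), is_well_formed(OVERLAP))
print(prefixed.tag, defaulted.tag, list(defaulted.attrib))
```

Both book elements end up with the same expanded name, which is why the prefix itself carries no meaning, while the id attribute stays outside the default namespace.
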
Editor's Notes

  • #42: W3C: World Wide Web Consortium