DMC1951
NOTES
UNIT I
XML INTRODUCTION
1.1 INTRODUCTION TO XML
A markup language describes the form of a document, i.e. it explains how the
document should be processed or interpreted. HTML may come to mind as soon as you
read the term markup language, and no wonder: HTML is the most popular markup
language. The differences between XML and HTML are covered in later portions of
this text.
You may be aware that a wide variety of markup languages exists. To
name a few: HTML (Hyper Text Markup Language), SGML (Standard Generalized
Markup Language), VRML (Virtual Reality Modeling Language) etc. Every markup language
has its own specialized purpose. HTML, for example, is the de facto standard for designing
web pages. VRML is used in the virtual reality domain.
The forerunner of all these markup languages is SGML, which is called the mother
of all markup languages. All the other markup languages are derivatives of SGML in one
way or another. So is XML. XML is undoubtedly among the most used markup
languages today.
1.1.2 What is XML?
XML stands for eXtensible Markup Language. As the name suggests, it is a markup
language and it is extensible in nature. XML is a World Wide Web Consortium (W3C)
specification. XML 1.0 became a W3C Recommendation on 10 February 1998, and the
second edition of the XML 1.0 Recommendation was published on 6 October 2000.
For any technology to become an official W3C standard, it has to go through stages
such as notes, working drafts, candidate recommendations and recommendations. The
complete explanation of each stage is outside the scope of this text. XML has gone through
all these stages and is now an official W3C recommendation. If you would like
to know these stages in more detail, visit http://www.w3.org. The formal XML
recommendation can be accessed at http://www.w3.org/TR/REC-xml.
1 ANNA UNIVERSITY CHENNAI
DMC 1801
XML plays a key role in almost all technologies, such as J2EE, .NET etc. In this section you
will learn about the various roles that XML plays in today’s Information Technology world.
The biggest advantage of XML is that it is both powerful and simple. XML is plain text in nature.
The list below is a set of roles that XML plays:
Structured Data Representation
Separating Data from User Interface
Standard for Transfer of data
Rule based data representation
Providing platform independence
Customized display of data
The primary role of XML is to represent data in a structured way. Normally, for
the purpose of storing data in a structured way you would use a database. But
introducing a database adds complexities: you may need a Database Management
System (DBMS) and become bound to a particular vendor’s product. Other ways of
representing data are CSV (comma separated values) or TSV (tab separated values).
Ram, 10, 25
Raj, 15, 30
John, 12, 40
The representation above (Figure 1) gives you no further information about the data. If
the same example is represented using XML, it may look as shown in Figure 2.
From that listing you can gather that the data is about a group of students and that
it contains their name, age and weight. So XML represents data in a
structured way from which certain information can be extracted at first glance.
<group>
<student>
<name> Ram </name>
<age> 10 </age>
<weight> 25 </weight>
</student>
<student>
<name> Raj </name>
<age> 15 </age>
<weight> 30 </weight>
</student>
<student>
<name> John </name>
<age> 12 </age>
<weight> 40 </weight>
</student>
</group>
The idea here is not to suggest that XML would replace databases but to emphasize that
XML has the capability to represent data in a meaningful way. Otherwise the domains
of databases and XML do not overlap much.
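The difference between the two representations can be sketched in code. The following is a minimal illustration in Python (not part of the original notes) that reads the same three students from the CSV rows and from the XML listing; the function names `read_csv_rows` and `read_xml_students` are made up for this sketch.

```python
import csv
import io
import xml.etree.ElementTree as ET

CSV_TEXT = "Ram, 10, 25\nRaj, 15, 30\nJohn, 12, 40\n"

XML_TEXT = """
<group>
<student><name> Ram </name><age> 10 </age><weight> 25 </weight></student>
<student><name> Raj </name><age> 15 </age><weight> 30 </weight></student>
<student><name> John </name><age> 12 </age><weight> 40 </weight></student>
</group>
"""


def read_csv_rows(text):
    """Return bare rows; nothing in the data says what each column means."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [row for row in reader if row]


def read_xml_students(text):
    """Return one dict per <student>, keyed by the child element names."""
    root = ET.fromstring(text.strip())
    return [
        {child.tag: child.text.strip() for child in student}
        for student in root.findall("student")
    ]


csv_rows = read_csv_rows(CSV_TEXT)
xml_students = read_xml_students(XML_TEXT)
print(csv_rows[0])       # three unlabeled values
print(xml_students[0])   # the same values, labeled name, age and weight
```

With the CSV rows the program has to know in advance that the second column is the age; with XML the label travels with the data.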
Since XML is simple text, the size of this representation can be smaller than that of some
proprietary binary representations.
XML tags have no built-in display functionality. You attach a separate section
for specifying the rendering logic.
For example you can associate XSLT (Extensible Stylesheet Language Transformations) with
XML to render the contents of XML in the way you want. You will find more about XSLT
in later sections. If you have some idea about CSS (Cascading Style Sheets), you will
feel at home with XSLT.
This idea of separating content from formatting gives us the following advantages.
XML Data
XML is an open standard, i.e. it is not proprietary. This lets XML operate across
application boundaries. For example, if you have two applications, one developed
using Microsoft .NET and another with J2EE, then nothing prevents you from transferring
data from the .NET application to the J2EE application and vice versa. This is possible
because XML is an open standard.
[Figure: a J2EE application and a Microsoft .NET application exchanging data through XML]
Figure: XML data transfer between two applications with different technologies
This scenario is depicted in the above figure. You can see that the arrow between the two
applications is double-sided, i.e. each application can both send data to and receive data
from the other.
Another advantage of transferring data with XML is that it is purely text. Text data
is less likely to be blocked by firewalls, whereas data transferred from one network to
another in a binary format has a fair chance of being blocked.
Being text, XML carries no self-executing code such as viruses. This makes XML a
desirable standard for transferring data from one network to another. This scenario is
shown in the following figure.
[Figure: a firewall between Network A and Network B, with two flows labelled "Data in proprietary format" and "Data in XML Format: Allowed"]
and it allows text transfer. In the case of proprietary formats it checks for executable
malicious code, and if any is found the transfer is blocked.
In the Internet-dominated world of today, security plays a vital role. Binary
representations of data may carry malicious code. This may lead to serious
problems on the Internet because of the huge number of senders and receivers
involved. The Internet cannot assure the genuineness of the parties transferring data. In
these circumstances there is a strong need for a standard capable of fast
and secure transfer of data. XML has both these capabilities, so it has become a widely
used standard for the transfer of data on the Internet.
A question may arise at this point regarding how XML transfer is faster.
The answer lies in the fact that an XML file contains nothing other than the
actual data and the tags around it. So the file can be small compared with
proprietary binary formats that carry additional information to carry out
certain operations. The power of XML lies in this simple nature.
The previous section explained that XML data is simply text. Even so, you can associate
certain rules with its representation. In real-world scenarios you may need to attach
rules to the data; to put it more clearly, you can specify what the format of the data
should be.
For example, if you are representing employee data, you can make sure that it
contains elements like employee id, name, department and salary. By writing definitions
with a DTD (Document Type Definition) you can enforce the elements of the data. If the
supplied data does not adhere to the rules given by the document type definition, a
validating parser raises errors.
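The idea of rule-based data can be sketched in code. Python's standard library does not validate against a DTD, so the following is only a hand-written stand-in for what a DTD-aware validating parser would do; the rule set `REQUIRED_CHILDREN` and the function `missing_children` are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: every <employee> must contain these child elements.
REQUIRED_CHILDREN = ("id", "name", "department", "salary")


def missing_children(employee_xml):
    """Return the required child element names that are absent."""
    employee = ET.fromstring(employee_xml)
    present = {child.tag for child in employee}
    return [name for name in REQUIRED_CHILDREN if name not in present]


COMPLETE = (
    "<employee><id>7</id><name>Ram</name>"
    "<department>Sales</department><salary>1000</salary></employee>"
)
INCOMPLETE = "<employee><id>8</id><name>Raj</name></employee>"

print(missing_children(COMPLETE))    # nothing missing
print(missing_children(INCOMPLETE))  # department and salary are missing
```

A real DTD expresses the same rule declaratively and is checked by the parser itself rather than by application code.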
This facility of specifying structure information with an XML file is an important
feature of XML. Imagine that this feature did not exist: anybody could enter any data,
and it would become nearly impossible to keep applications in step with each other. As the
major purpose of XML is data transfer between different applications, it is
necessary to have certain rules governing the data representation.
The key point to understand here is that these rules are not imposed as general
conditions on all XML files. In fact these rules are formulated by the system designers for
each file individually. The reason for not having generalized conditions is that the power of
XML lies in its extensibility. If hard rules regarding the data format were introduced,
this extensibility would be threatened.
Applying a DTD makes an XML file follow certain rules which are specific
to that file alone. This ensures integrity with respect to the format of the data. So the DTD is
a cornerstone in the world of XML application development. Another point worth mentioning
here is that a DTD is not mandatory. You can have an XML file which does not
have a DTD associated with it. So the control is in the hands of the designer, who
can decide whether to attach a DTD to an XML file or not.
XML achieves independence not only from application development technologies;
it also supports cross-platform data transfer.
For example, imagine two applications, one running on the Linux operating system
and another on the Mac operating system. XML can be used to perform effective data transfer
between applications running on entirely different operating systems. This is depicted in the
following figure.
[Figure: an application running on Mac OS and an application running on Linux exchanging data through XML]
The advantage you gain from this is the seamless integration of applications
running on different operating systems. Internet technologies are also cross-platform,
so XML co-exists comfortably with these web technologies.
You may ask how this becomes possible. The answer lies in the
fact that XML parsers are available for all the major operating systems. You will
learn more about XML parsers in later sections.
For example, consider a scenario where you have data about a hundred students in
XML format. You can display this data in a tabular format or in any
other custom format that you wish. How to achieve this will be answered in later portions
of this text. For the moment it is enough to understand that customized display of
the same data is possible with XML.
The technique that enables this customized display is XSLT. Using XSLT
you can display the same data in different formats. In section 1.3.2
you learned that the same XML file can be formatted to look different on a personal
computer, on a mobile device and in printed output.
The way to achieve this is by attaching different XSLT files to the same XML file. The
moment you change the XSLT file, the display format changes as well. The reason for this
is that XML tags by themselves do not carry any display logic. So it is the XSLT that changes
the look and feel of the XML content.
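The separation of data from display can be sketched as follows. Python's standard library has no XSLT processor, so in this illustration two ordinary functions play the role that two different XSLT files would play; the names `render_table` and `render_list` and the sample data are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = (
    "<group>"
    "<student><name>Ram</name><age>10</age></student>"
    "<student><name>Raj</name><age>15</age></student>"
    "</group>"
)


def students(xml_text):
    """Extract (name, age) pairs; the XML itself holds no display logic."""
    root = ET.fromstring(xml_text)
    return [(s.findtext("name"), s.findtext("age")) for s in root.findall("student")]


def render_table(xml_text):
    """First presentation: fixed-width columns."""
    lines = ["{:<6}{:>3}".format("Name", "Age")]
    lines += ["{:<6}{:>3}".format(name, age) for name, age in students(xml_text)]
    return "\n".join(lines)


def render_list(xml_text):
    """Second presentation of the same data: one sentence per student."""
    return "\n".join(
        "- {} is {} years old".format(name, age) for name, age in students(xml_text)
    )


print(render_table(STUDENTS_XML))
print(render_list(STUDENTS_XML))
```

The XML text is never modified; only the rendering function changes, which is the same relationship that exists between one XML file and several style sheets.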
Now you may ask what happens if you do not attach any XSLT file to the
XML. The answer is browser dependent, but most browsers display the XML file as a
tree structure which you can fold and unfold. Mozilla Firefox states clearly
that it is displaying the content with a default style because no specific
style information is attached.
This customization feature has been one of the important reasons for the success
of XML. If you have some knowledge of HTML and CSS, you will see that the relation
between XML and XSLT is similar to the relation between HTML and CSS.
The World Wide Web has become the biggest collection of information. You can find
information about anything, ranging from a child’s toy to nanotechnology, on the web.
The size of the web is increasing exponentially. The latest web technologies
enable anyone to post information
on the web, which causes the phenomenon of web explosion. The lines above should
give you a feel for the massive size of the web.
The massive size of the web is both an advantage and a disadvantage.
On the advantage side, all kinds of information are available on the web. On
the disadvantage side, there is the problem of retrieving the relevant information and
organizing such a massive collection.
You would have observed that the above two paragraphs do not contain the word
XML. You may ask where XML fits into this scene. The objective of this
section is to answer this interesting question.
Hyper Text Markup Language is the most used markup language on World Wide
Web. Let us explore the similarities and differences between HTML and XML in this
section.
The primary role of HTML is the rendering of web content. All browsers
understand HTML. HTML is a markup language consisting of a collection of tags whose
behavior is predefined, i.e. each and every tag in HTML has a built-in meaning associated
with it. The browser interprets it and renders the output. Of course there are subtle
differences between browsers in displaying HTML content. The underlying
fact is that all browsers can understand HTML and render the output.
If you compare HTML and XML with respect to display behavior, the latter does not have
tags with predefined meaning. The tags in XML are defined by the users,
so they do not have any display functionality associated with them. Their main purpose is
to organize the data rather than to display it. This is the
major difference between XML and HTML from the viewpoint of displaying tags in web
browsers.
Another important difference between XML and HTML lies in the strictness with which
rules are followed. The rules in HTML are not strict, i.e. there are certain
rules which can be followed or neglected. For example, closing tags are not
always mandatory in HTML whereas they are compulsory in XML.
HTML snippet:
<form>
<b> Enter your name here
<input type = text>
</form>
XML snippet:
<name>
<first> Ram </first>
<last> Kumar
</name>
In the above figure, an HTML code snippet is given on one side and an
XML snippet on the other. The HTML snippet has tags like <b> and <input>
which are not closed. Though the closing tags are missing, browsers accept the HTML snippet.
No browser would show an error message for it. At the same time, if you
look at the XML snippet you can observe that the closing tag for <last> is missing. Only
one closing tag is missing, but XML will not accept this. Closing tags are compulsory in
XML. There is a concept called well-formed XML which requires many criteria
to be satisfied.
Another important difference with HTML and XML is the nesting of tags. HTML
doesn’t impose hard rules on nesting of tags. XML is very strict on nesting of tags. The tag
that is opened last should be closed first. This rule can not be violated in XML. But HTML
is lenient regarding this rule.
HTML snippet:
<form>
<b> Enter your name here
<input type = text>
</form>
</b>
XML snippet:
<name>
<first> Ram </first>
<last> Kumar
</name>
</last>
The figure compares HTML and XML with respect to the nesting of
tags. In the HTML snippet the <b> tag is closed after the </form> tag, which is not proper
nesting because the <form> tag was opened before <b>. So the <b> tag has to be
closed before the <form> tag. This rule is not followed here, yet a browser does not show an
error message for it. At the same time, if you look at the XML snippet, the <last> tag is not
closed according to the nesting rules. XML will not accept this as a valid XML
snippet, so there will be an error when this XML code snippet is parsed.
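The strictness about closing tags and nesting can be checked with any XML parser. The following minimal sketch uses the parser in Python's standard library; the helper name `well_formedness_error` is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


GOOD = "<name><first> Ram </first><last> Kumar </last></name>"
# <last> is never closed.
MISSING_CLOSE = "<name><first> Ram </first><last> Kumar </name>"
# <last> is closed only after </name>, which breaks the nesting rule.
BAD_NESTING = "<name><first> Ram </first><last> Kumar </name></last>"

for label, text in [("good", GOOD), ("missing close", MISSING_CLOSE),
                    ("bad nesting", BAD_NESTING)]:
    print(label, "->", well_formedness_error(text))
```

Only the first snippet is accepted; for the other two the parser stops and reports an error instead of guessing what was meant, which is the opposite of how browsers treat HTML.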
The final difference between HTML and XML is case sensitivity. The former
is not case-sensitive but the latter is. HTML tags are not differentiated with respect to case. But
in XML two tags with the same letters but different case are not the same; they are
considered two different tags. This is depicted in the following figure.
HTML snippet:
<form>
<b> Enter your name here
<input type = text>
</FORM>
</b>
XML snippet:
<name>
<first> Ram </first>
<last> Kumar </LAST>
</name>
In the above figure, the HTML snippet has tags whose opening is given in one case and closing in
another. For example the closing tag for <form> is given as </FORM>, i.e. in
upper case. Nothing goes wrong here as far as HTML is concerned. If you look at the XML
code snippet, the tags <last> and </LAST> are not considered a pair. The reason
for this is the case sensitivity of XML tags.
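Case sensitivity can be demonstrated in the same way. In the minimal sketch below (Python standard library; the helper name `parses` is an assumption of this illustration), <last> closed by </LAST> is rejected, while <Player> and <player> are kept apart as two different element names.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


MATCHING_CASE = "<name><last> Kumar </last></name>"
MIXED_CASE = "<name><last> Kumar </LAST></name>"

print(parses(MATCHING_CASE))  # True
print(parses(MIXED_CASE))     # False: </LAST> does not close <last>

# Two tags that differ only in case are two different elements.
team = ET.fromstring("<team><Player/><player/></team>")
child_tags = [child.tag for child in team]
print(child_tags)             # ['Player', 'player']
```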
So in general, you can conclude that HTML is a markup language in which
you would not find hard and fast rules. But in the case of XML, if there is a rule for something
it cannot be neglected for any reason. On the other hand, XML is flexible with respect to the
creation of new tags, i.e. you can create your own tags out of the blue, whereas in HTML this
is not possible. HTML has a predefined set of tags with which the rules are not strict. XML
has an unlimited set of tags with which the rules are compulsory.
Such a lengthy comparison between XML and HTML is
necessary because you must be very clear about the features of XML and
HTML. The reason for emphasizing this is that you should not study XML from an HTML
viewpoint. The two tools are different and their functionalities also differ.
The only similarity between XML and HTML is that both are derivatives
of SGML (Standard Generalized Markup Language). SGML is the mother of all
markup languages.
On the World Wide Web, different parties work together to achieve certain goals. If there
is no technology which is above all the differences, then the power of the web would become
a question mark. An example scenario is depicted in the figure.
[Figure: a distributor exchanging XML with Manufacturer 1 and Manufacturer 2]
with the help of XML adds an entirely new dimension to the World Wide Web. This
XML-enriched semantic web is becoming one of the promising trends in the world of web.
Search engines can retrieve contents which are more relevant in the semantic web than in the
normal web.
The capability of all browsers to handle XML is another big advantage.
There are slight variations between them, but at the core level they all support XML.
If you go through any XML book published four or five years earlier, you
would find a sentence saying “XML is the next big thing in the world wide web”. The time
has come to strike out the word “next” in those sentences, because XML has already become
the big thing on the web.
In terms of importance, you can compare the role of XML on the web with the
role of ASCII (American Standard Code for Information Interchange) in the desktop paradigm.
In the desktop paradigm ASCII played a vital role as the common representation standard
across all applications. XML has now taken that role on the web. With the added diversity
of the web, XML does what ASCII did for the desktop paradigm. You
should not conclude from this that ASCII and XML are related
technologies. The comparison is made from the perspective
of the critical roles they play in their respective paradigms.
So we can conclude that the role of XML on the web is multi-faceted. On one hand it acts as
the glue technology bridging applications running on the web. On the other hand it enriches
the search capabilities for content on the web. In another dimension, the extensible nature
of XML suits the World Wide Web well.
QUESTIONS
Part A
3. XSLT refers to
a. eXtended Secure Language Technique
b. eXclusive Style Language Tool
c. eXtensible Stylesheet Language Transformations
d. None of the above
4. DTD refers to
a. Document Tool Design
b. Data Tool Design
c. Document Type Definition
d. None of the above
5. XML is
a. Not case sensitive
b. Less strict than HTML
c. Not related to Web
d. None of the above
Answers
1. b 2. a 3. c 4. c 5. d
Part B
Short Questions
6. List out the roles of XML.
7. How does XML provide platform independence?
8. List out the needs of XML in Web.
9. How does XML separate data from user interface?
Part C
Descriptive Type Questions
10. Compare and Contrast XML and HTML.
1.2 XML BASICS
1.2.1 Introduction
In this chapter you will learn the basics of XML syntax. After going through this
chapter you will be able to create your own XML files. In addition you will
get an introductory idea of the nomenclature used in XML. The chapter is
arranged in a step-by-step explanatory manner so that you learn
one thing at a time and, towards the end, have collective knowledge of
those topics.
<?xml version="1.0" encoding="UTF-8"?>
<team>
<player>
<name> Rahul </name>
<age> 33 </age>
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
</player>
</team>
In the previous section you learned how to create a simple XML file. After
reading this section you will be familiar with each and every component, i.e. the anatomy,
of an XML file. The file is examined from the first line right up to the
end of the file.
This line is called the XML declaration. It describes important attributes of the file. The
line begins with <?xml, which indicates that the file is an XML file. Immediately following
this is the “version” attribute, which has the value 1.0 in the above example.
This attribute tells which version of XML you are using. The possible
values for this attribute are 1.0 and 1.1. The value determines how the file
is parsed by browsers or applications. To be on the safer side, go for
version 1.0 because it is supported by almost all popular browsers and applications.
Another point to note is that in XML 1.0 the XML declaration as a whole is optional, but
when the declaration is present the version attribute is required. Another interesting point is
that the initial drafts of XML used <?XML ?>, which was later changed to the lowercase <?xml ?>.
Now you have to use only the latter, i.e. the lowercase form.
The next attribute is “encoding”. It refers to the “character set” that is used to
represent your file. The default in many Windows-based editors is ASCII, i.e.
American Standard Code for Information Interchange. ASCII can represent
only pure text documents, that is, files whose content is limited to characters
such as A to Z, a to z, 0 to 9 and common punctuation. ASCII is a 7-bit code, so the total
number of symbols possible in ASCII is 128. For example the ASCII value of “A”
is 65, of “B” 66 and so on. The drawback of ASCII is its inability to express
languages other than English, such as Chinese or Hindi. The World Wide Web is not
only for English; it supports many other languages. In the previous chapter you
read that XML breaks the barriers of technology, platform etc. So it cannot
restrict itself to one human language, i.e. English. XML has to support other languages too.
The solution to the above problem is to move to a character set which supports
more characters than ASCII, preferably the characters of many human
languages. One such character set is Unicode. Unicode was originally designed as a
2-byte code supporting 65,536 (2^16) characters and has since grown beyond that range.
There is another character code called the
Universal Character Set (UCS) which supports almost 2 billion symbols. The UTF-8
that you have seen for the encoding attribute is “UCS Transformation Format-8
(UTF-8)”. The specialty of UTF-8 is that it uses a variable number of bytes, from one to
four, per symbol. For a symbol that can be represented with one byte, such as the letters
a to z, it uses only one byte. For symbols outside the one-byte range it uses two or more
bytes per symbol.
There is another format called UTF-16, in which the smallest unit is two bytes
and less commonly used symbols take four bytes. The point to note
here is that you can use the values UTF-8 and UTF-16 for the encoding attribute. As with
version 1.0 for the version attribute, UTF-8 is supported by all XML processors. So if your
text is mostly made of symbols that fit in one byte, UTF-8
is the more space-conscious choice. The default value for the encoding attribute is
UTF-8, i.e. even if you omit the encoding attribute the value UTF-8 is assumed
automatically (unless the file starts with a UTF-16 byte order mark).
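The byte counts discussed above can be verified directly. The following minimal Python sketch (the three sample characters and the name `byte_lengths` are chosen only for illustration) encodes a Latin letter, a Devanagari letter and an emoji in UTF-8 and UTF-16.

```python
# Sample characters: one from ASCII, one from Hindi, one outside the
# first 65,536 code points.
SAMPLES = {
    "Latin letter A": "A",
    "Devanagari letter A": "\u0905",
    "emoji": "\U0001F600",
}


def byte_lengths(character):
    """Return (UTF-8 length, UTF-16 length) in bytes for one character."""
    # "utf-16-le" is used so that no byte order mark is added to the count.
    return len(character.encode("utf-8")), len(character.encode("utf-16-le"))


print(ord("A"), ord("B"))  # 65 66, the ASCII values quoted in the text
for label, character in SAMPLES.items():
    print(label, byte_lengths(character))
```

The Latin letter takes 1 byte in UTF-8 and 2 in UTF-16, the Devanagari letter 3 and 2, and the emoji 4 in both, which is why UTF-8 is the compact choice for mostly English text.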
There is one more attribute which you can use here, called the “standalone”
attribute. When it is used it must come after the encoding attribute. The example is given below:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The purpose of the standalone attribute is to indicate whether this XML file is
complete by itself or needs support from other files. If it does not require support from
other files you can use the value “yes” for this attribute. If it requires support from other
files you can use the value “no”. Both values are written in lowercase.
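The XML specification fixes the order of the attributes in the declaration as version, then encoding, then standalone, and accepts only the lowercase values yes and no. The following minimal sketch checks this with the parser in Python's standard library; the helper name `parses` and the empty <team/> document are assumptions of this illustration.

```python
import xml.etree.ElementTree as ET


def parses(document_bytes):
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(document_bytes)
    except ET.ParseError:
        return False
    return True


VALID = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><team/>'
# standalone placed before encoding
WRONG_ORDER = b'<?xml version="1.0" standalone="yes" encoding="UTF-8"?><team/>'
# capitalised value
WRONG_CASE = b'<?xml version="1.0" encoding="UTF-8" standalone="Yes"?><team/>'

for label, document in [("valid", VALID), ("wrong order", WRONG_ORDER),
                        ("wrong case", WRONG_CASE)]:
    print(label, parses(document))
```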
In general this XML declaration line comes under the category called XML prolog.
There exist various other parts of XML prolog. You can find more information regarding
XML prolog in later sections.
In our example code you can find a comment in the second line.
This comment is similar to the comments you would have used in HTML. An
XML comment begins with the symbols <!--, followed by the actual comment,
and it has to end with -->.
Though you can insert comments as you wish, there are certain conditions
to be followed while placing them. One such constraint is that a comment cannot
be placed inside a tag. For example
is an invalid comment. The tag <name> has been broken and a comment has been placed
inside it. XML processors do not accept such comments.
According to the XML 1.0 specification, placing a comment before the XML declaration
is invalid. So when an XML declaration is used, it must be the first line of the XML file. Only
after that can you use XML tags or comments.
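The two placement rules for comments can be tried out with a parser. The sketch below uses Python's standard library; the comment texts and the helper name `parses` are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def parses(document_bytes):
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(document_bytes)
    except ET.ParseError:
        return False
    return True


# A comment after the declaration and outside any tag is fine.
VALID = b'<?xml version="1.0"?><!-- squad list --><team><name>Rahul</name></team>'
# A comment placed inside the <name> start tag breaks the tag.
COMMENT_IN_TAG = b'<team><name <!-- broken --> >Rahul</name></team>'
# A comment placed before the XML declaration.
COMMENT_BEFORE_DECLARATION = b'<!-- too early --><?xml version="1.0"?><team/>'

print(parses(VALID))                       # True
print(parses(COMMENT_IN_TAG))              # False
print(parses(COMMENT_BEFORE_DECLARATION))  # False
```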
XML tags are the basic elements of an XML file. XML tags are similar to HTML tags.
Any tag in XML has to start with the symbol <. Then comes the tag name. There
are certain rules for XML tag names, as explained below.
An XML tag name can contain letters, numerals and some special characters such as the
underscore, hyphen and period. A tag name cannot start with a number, a hyphen or a
period. XML tag names cannot contain a space. Another important point is that XML tag
names cannot begin with the term XML, in any combination of upper and lower case.
The following table lists various xml tags and indicate whether they are valid or invalid.
It also provides you the reasons for which these tags are considered either as valid or
invalid.
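The naming rules above can be approximated with a small checker. This is only a simplified sketch in Python: it covers ASCII names, deliberately leaves out the colon, and the function name `is_acceptable_tag_name` is hypothetical. The full rule in the XML specification also allows many non-English letters.

```python
import re

# Simplified, ASCII-only approximation of the XML name rule:
# first character is a letter or underscore, the rest may also be
# digits, hyphens or periods. No spaces are allowed anywhere.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_acceptable_tag_name(name):
    """Return True if the name follows the simplified rules above."""
    if name.lower().startswith("xml"):
        return False  # names beginning with "xml" in any case are reserved
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["player_name", "player2", "2player", "player name",
                  "-player", "XmlPlayer"]:
    print(candidate, is_acceptable_tag_name(candidate))
```

Note that many parsers do not actually reject names starting with "xml"; the reservation is a rule of the specification that authors are expected to respect.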
Apart from the strict rules there are certain best practices which make
your XML listing more professional. For example you can avoid using ‘.’ in XML tag names
because ‘.’ is reserved for other purposes in many programming languages.
Similarly you can avoid using ‘:’ in your XML tag names because the colon is used for XML
namespaces and tends to create misinterpretation among readers. If you follow these
conventions in your tag names, they will increase the readability of your XML file.
Immediately following the tag name is the > symbol. As in HTML, a start tag
has a matching end tag. The syntax of the end tag is the same as the start tag except for the
inclusion of the / symbol. For example the valid closing tag for <player_name> is
</player_name>.
Having understood about XML tags the next step for you is to understand XML
elements. Actually XML tags are part of an XML element. To make things much clear the
components of an XML element is as given below:
For example
Where
Dhoni = Text
For certain elements there may not be any text at all. Such elements are
called empty elements. For example
<lastname />
An empty element does not need a separate closing tag. Instead you can close the tag
then and there by putting a ‘/’ before ‘>’, optionally preceded by a space. This is shown
in the above example.
You are already aware of the rules for XML tags. Now you may ask
whether there are any rules for the text that you place between tags. In a broader
perspective the answer is ‘No’. Note the term ‘broader’. You may place almost anything
you want as XML text.
In the above example you can find the numbers 1 and 0 in between the <name> tags.
There is no restriction that only names may be placed there. XML gives you that freedom.
It is up to you to supply the necessary data as XML text.
Another point to note is that there is no restriction on the length of XML text. The text can
be of any length you wish. XML does not specify any theoretical limit on the length of the text.
If you insert white space within the text, the white space is preserved by
XML. It is the responsibility of the target application to keep or discard it.
For example, Microsoft Internet Explorer collapses the extra white space and displays the
output without it. The following example illustrates this.
<team>
<player>
<name> Rahul </name>
<age> 33 </age>
</player>
<player>
<name> Sachin      Tendulkar</name>
<age> 35 </age>
</player>
</team>
In the sample file above, look at the name “Sachin Tendulkar” with many spaces
between the two words. If you look at the output, it does not contain the extra spaces. These
spaces are automatically stripped out in the output.
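That the parser itself keeps the white space, and that dropping it is a decision of the consuming application, can be seen in this minimal Python sketch (standard library only; the variable names are chosen for illustration).

```python
import xml.etree.ElementTree as ET

# Six spaces between the two words, plus one leading space.
name = ET.fromstring("<name> Sachin      Tendulkar</name>")

raw_text = name.text                   # exactly what the XML file contains
collapsed = " ".join(raw_text.split())  # what a display layer may choose to show

print(repr(raw_text))   # ' Sachin      Tendulkar'
print(repr(collapsed))  # 'Sachin Tendulkar'
```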
Though XML is flexible about what you place as text, there is a rule: if the text contains
characters that have a special meaning in XML, like ‘<’, it will cause errors.
<player>
<name> Rahul </name>
<age> 33 </age>
</player>
<player>
<name> Sachin Tendulkar</name>
<age> 35 </age>
</player>
</team>
Internet Explorer version 6.0 has shown an error because of the symbol < in the
<average> element.
In XML text it is always better to go for the entity equivalents for special symbols.
Table shows the symbol and its corresponding entity equivalent.
<player>
<name> Rahul </name>
<age> 33 </age>
</player>
<player>
<name> Sachin Tendulkar</name>
<age> 35 </age>
</player>
</team>
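Replacing special symbols by their entity equivalents does not have to be done by hand. The following minimal sketch uses `escape` from Python's standard library, which replaces &, < and > by &amp;amp;, &amp;lt; and &amp;gt;; the sample text placed in the <average> element is made up for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "runs < 50 & wickets > 2"

# Without escaping, the '<' and '&' make the document ill-formed.
try:
    ET.fromstring("<average>" + raw + "</average>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False

escaped = escape(raw)
element = ET.fromstring("<average>" + escaped + "</average>")

print(unescaped_ok)   # False
print(escaped)        # runs &lt; 50 &amp; wickets &gt; 2
print(element.text)   # runs < 50 & wickets > 2
```

The parser turns the entities back into the original symbols, so the application reading the file sees the text exactly as it was before escaping.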
This method of replacing symbols with entity equivalents is good for this kind of
scenario. Now imagine a case where you have a portion of a C program for display
purposes: it would be tedious to replace all the special symbols by their equivalent
entities, and it would also be a problem to edit the contents later on. The effective solution
for this kind of scenario is to use a CDATA section. The following example shows an
efficient usage of a CDATA section.
The above XML code contains a portion of C code as text. Imagine having to
replace all the special symbols with entity equivalents. To avoid this you can use a
CDATA section. Look at the CDATA section syntax: the section starts with the sequence
‘<![CDATA[’ and ends with the sequence ‘]]>’.
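The effect of a CDATA section can be sketched as follows. The C fragment and the element name <listing> are made up for this illustration; the parser is the one in Python's standard library.

```python
import xml.etree.ElementTree as ET

# A fragment of C code full of characters that are special in XML.
C_SNIPPET = "if (a < b && b > 0) { total += a; }"

# Wrapped in a CDATA section, the fragment needs no entity replacement.
document = "<listing><![CDATA[" + C_SNIPPET + "]]></listing>"
listing = ET.fromstring(document)
print(listing.text)

# The same fragment placed directly as text is rejected.
try:
    ET.fromstring("<listing>" + C_SNIPPET + "</listing>")
    plain_ok = True
except ET.ParseError:
    plain_ok = False
print(plain_ok)  # False
```

The one thing a CDATA section cannot contain is the closing sequence itself, since that is how the parser finds the end of the section.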
You have already learned in the previous sections that the rules of XML are strict,
i.e. they are mandatory. Any XML file has to follow these rules. An XML file which
follows these rules is called well-formed. This section summarizes the
rules for an XML file to be called well-formed.
The presence of a root element is mandatory for an XML file to be well-formed.
In the previous example in this chapter the root element is
<team>. All the other elements of the file are placed inside this root element. The root
element is opened first and closed last.
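The root element rule can be checked in a few lines. In this minimal Python sketch (standard library; the helper name `root_tag` is made up for the illustration) a document with a single <team> root is accepted, while two sibling elements at the top level are rejected.

```python
import xml.etree.ElementTree as ET


def root_tag(xml_text):
    """Return the root element's name, or None if the text is not well-formed."""
    try:
        return ET.fromstring(xml_text).tag
    except ET.ParseError:
        return None


ONE_ROOT = "<team><player/><player/></team>"
TWO_ROOTS = "<player/><player/>"

print(root_tag(ONE_ROOT))   # team
print(root_tag(TWO_ROOTS))  # None
```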
You already know that tags can be nested in XML. This nesting should be proper:
the tag that is opened last should be the tag that is closed first. Again you can refer to the
previous example in this chapter. There the tag <team> is
opened first and it is the same <team> tag which is closed last. Not only this tag but every
tag in XML has to follow this nesting rule.
If you use any attribute in your XML file then the value for those attributes should be
given in quotes. Look at the following example.
The name tag now has an attribute called “role”. The value for this role attribute
should be given in quotes for this XML to be well-formed. You can recall the fact that in
HTML this rule of quotes is not followed strictly whereas in XML it is followed strictly.
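The well-formedness rules discussed so far (a single root element, proper nesting and
quoted attribute values) can be tried out with the small sketch below. It assumes Python's
standard xml.etree.ElementTree parser; the helper name is_well_formed and the attribute
value "captain" are illustrative only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One root element, proper nesting, attribute value in quotes.
good = '<team><player><name role="captain">Rahul</name></player></team>'
# <name> is opened last but <player> is closed first.
bad_nesting = '<team><player><name>Rahul</player></name></team>'
# The attribute value is not enclosed in quotes.
unquoted = '<team><player><name role=captain>Rahul</name></player></team>'
# Two top-level elements, so there is no single root element.
two_roots = '<team></team><team></team>'

for sample in (good, bad_nesting, unquoted, two_roots):
    print(is_well_formed(sample))
```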
In the above paragraph, you learned the necessity of the closing tag in XML. There
you came across the term empty element. An element is called an empty element if it
doesn’t possess any text between its starting tag and ending tag. Look at the following
example.
<player>
<name> Rahul </name>
<age> 33 </age>
<wickets />
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
<wickets> 153 </wickets>
</player>
</team>
Figure : XML with properly managed empty tags
In the previous example, the empty <wickets /> tag does not have a value associated
with it, so it has no separate closing tag. Instead, note the presence of “/” before the symbol
“>”. You would have noticed a space between the word “wickets” and “/”. This space is
recommended in XHTML (more on XHTML later) to cope with older web browsers.
It is not mandatory in XML, because XML processors can recognize empty elements
even without the space.
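As a quick check of this point, the following sketch (assuming Python's standard
xml.etree.ElementTree parser) parses the three ways of writing an empty <wickets>
element. Each form yields an element with no text and no children.

```python
import xml.etree.ElementTree as ET

# With a space, without a space, and with an explicit closing tag.
forms = ["<wickets />", "<wickets/>", "<wickets></wickets>"]
parsed = [ET.fromstring(form) for form in forms]

for element in parsed:
    # tag name, text content (None when empty), number of child elements
    print(element.tag, element.text, len(element))
```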
[Figure: an application using XML alongside DBMS tools such as MS Access and Oracle]
Here you are not replacing DBMS tools like MS Access and Oracle with XML; rather,
you can use XML in combination with them to achieve interoperability between various
applications. So the point to remember here is that XML is not a DBMS.
From the above discussions you would have understood the basics of XML. This
section presents three important revolutions brought about by XML, i.e. the changes
effected by XML in various dimensions. The following list gives these dimensions.
1. Data Revolution
2. Architecture Revolution
3. Software Revolution
Prior to XML, data was considered to be application specific. The data associated
with an application was in the proprietary format of the application itself. The primary problem
with this kind of approach is that data becomes locked within a particular application. If
data had to be communicated at all, it had to be sent as parameters to functions which
are again application specific. This is depicted in the following figure.
The top portion of the figure indicates how data was communicated prior to XML.
The lower half indicates how XML changed data from parameters to documents. These
documents are sent across the web. Each application, both sending and receiving,
has the capability to understand and parse these documents. After parsing the
documents, the data is extracted. The primary advantage that we achieve through this
is application-neutral data. This increases the ease of data transfer between applications.
In addition to the data revolution, XML has provided a drastic paradigm shift in the manner
in which applications are architected. Prior to XML, applications were tightly coupled, i.e.
any change in one application may require one or more changes in other applications. This
is illustrated in the following figure.
XML has provided the loosely coupled applications. The advantage here is that you
can become free of vendor binding and technology binding. You can choose technologies
from various vendors suited to specific components and yet achieve the interoperability
among these applications using XML.
[Figure: Application 1, Application 2 and Application 3 connected through the Web]
XML has made a huge impact on how applications are developed. Prior to XML,
software development would be strictly in accordance with well-described requirement
specifications. The problem with this approach is that if you would like to make modifications
based on real-time requirements, they would be very difficult to carry out. With the
introduction of XML, software development has become a collaborative process.
Now the designer has to assemble various components based on the present
requirements. Whenever there is a change in requirements, the corresponding components
can be introduced into the assembly.
This approach of XML makes the software flexible in nature. Another advantage
is that you can select existing components which are well tested and yet extensible for
your application.
The data, architecture and software revolutions described above have created
new ways of application development. This three-dimensional impact of XML on software
development has given XML a strong position in the information technology industry.
Questions
Part A
Objective Type Questions
1. XML can be edited in
a. vi editor
b. Notepad
c. GEdit
d. All of the above
2. UCS refers to
a. Universal Character System
b. Unicode Compatible System
c. Useful Characters Sample
d. None of the above
3. The entity used for “&”
a. &amp;
b. &er
c. &s
d. None of the above
4. Which of the following is a requirement for well-formed XML
a. XML declaration
b. XML definition
c. XML comments
d. None of the above
5. After the introduction of XML, data is sent as
a. Document
b. binary format
c. image
d. None of the above
Answers
1.d 2. a 3. a 4. a 5. a
Part B
Short Questions
6. Explain the anatomy of an XML file.
7. Explain the components of an XML element.
8. Explain entities in XML.
9. What are the rules for well-formed XML?
Part C
Web services are business logic that resides on a remote machine, which you can easily
access by using standard protocols like Hyper Text Transfer Protocol (HTTP), Transmission
Control Protocol (TCP) etc.
Now you may have a question: how are web services different from technologies
like RMI (Remote Method Invocation), CORBA etc.? There is a very basic difference
between these technologies and web services. Technologies like RMI, CORBA etc. are
either vendor specific or platform specific, whereas web services are independent of
platform and vendor. To use RMI, CORBA or DCOM (Distributed Component Object
Model), both the sender and the receiver must have something in common, such as the
platform, vendor or technology.
But the web that you use is not homogeneous, i.e. there exist many technologies from
many vendors. Still, you would like to establish communication between these. Web
services are there to help you achieve this interoperability. This interoperability is
possible because web services are based on XML. You have
already learned that XML is not bound to any specific operating system, technology
or vendor. So this neutrality bubbles up to web services from XML.
1.3.2.2 Characteristics of Web Services
Apart from the neutrality factor described above, web services have various other
characteristics. This section explains these characteristics of web services.
1.3.2.2.1 Loosely Coupled
The web is client-server based. Normally the client and server in web technologies are
tightly coupled, i.e. any modification in the server side interface would require one or more
modifications at the client side also. In the case of web services this is not true. Here the
requestor or consumer of the service and the provider of the service are loosely coupled.
Because of this loosely coupled nature the application becomes easily maintainable, i.e.
you can modify the server interface and the client would still be able to access the services,
provided that a certain level of integrity is maintained. More on this is explained in later
portions of this text.
1.3.2.2.2 Synchronous or Asynchronous
By synchronous we mean that once the client calls a service, it waits until
the server execution completes. In the asynchronous case, the client doesn’t wait for the
server execution to complete. Web services can be invoked either synchronously or
asynchronously. The choice is left to the user, who can decide based on the situation whether
to call a service in synchronous mode or in asynchronous mode.
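The difference between the two calling modes can be sketched in Python as follows. This
is only an illustration under stated assumptions: get_player_age is a hypothetical local
function standing in for a remote web service operation, and a thread pool from the
standard concurrent.futures module stands in for the asynchronous invocation mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

def get_player_age(name):
    """Hypothetical stand-in for a remote web service operation."""
    ages = {"Rahul": 33, "Sachin": 35}
    return ages[name]

# Synchronous mode: the caller waits here until the result is available.
sync_result = get_player_age("Rahul")

# Asynchronous mode: the call is submitted, the caller carries on with
# other work, and the result is collected later.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(get_player_age, "Sachin")
    other_work = "caller keeps working"  # not blocked by the pending call
    async_result = future.result()

print(sync_result, async_result)
```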
Web services can be built using a variety of technologies. To name a few, you can
develop web services in J2EE, .NET, PHP, Perl etc. So to implement a web service, the
developer can choose his/her own technology. This makes the development of web services
an easier task.
1.3.2.2.4 Discoverability
Web services can reside anywhere on the internet, so it becomes mandatory that they
be discoverable. Web services achieve this discoverability through UDDI (Universal
Description, Discovery and Integration). With this, the location of a web service becomes
insignificant because it is discoverable from anywhere on the internet.
The web services can be modeled with three basic components. They are Service
provider, Service broker and Service requestor. The relationships among these three
components are shown in the following figure
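[Figure: web services model. The service provider performs “Register Service” with the
service broker; the service requestor performs “Discover Service” through the service broker
and “Invoke Service” on the service provider.]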
The role of the service provider is to develop and deploy the web services. The service
provider also defines the services. The role of the service broker is registration and discovery
of services. The primary role of the service requestor is to invoke the web services. Here the
roles are given very briefly. More on web services would be discussed in later portions of
this text.
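The three roles can be pictured with the toy sketch below. It is not a real UDDI client:
ServiceBroker is a hypothetical in-memory registry, and age_service is a made-up service
offered by the provider. The sketch only mirrors the register, discover and invoke steps
of the model.

```python
class ServiceBroker:
    """Hypothetical in-memory registry standing in for a service broker."""

    def __init__(self):
        self._registry = {}

    def register(self, name, service):
        # Called by the service provider.
        self._registry[name] = service

    def discover(self, name):
        # Called by the service requestor.
        return self._registry[name]


def age_service(player):
    """Made-up service developed and deployed by the service provider."""
    return {"Rahul": 33, "Sachin": 35}[player]


broker = ServiceBroker()
broker.register("PlayerAge", age_service)  # provider: Register Service
service = broker.discover("PlayerAge")     # requestor: Discover Service
result = service("Rahul")                  # requestor: Invoke Service
print(result)
```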
SOAP (Simple Object Access Protocol) packages the XML for transfer between
various clients. The actual XML contents are wrapped in the SOAP structure. Because
of this general structure, any SOAP client can easily access the contents.
WSDL is the Web Services Description Language. As the name indicates, it describes the web
service invocation methodology, parameters etc. WSDL makes the interaction between
the client and the web services smoother.
Other than the above specified techniques, there exist various other technologies like
WSCI (Web Services Choreography Interface), WSFL (Web Services Flow Language),
DSML (Directory Services Markup Language) etc.
1.3.3 SOAP
SOAP (Simple Object Access Protocol) is an XML based protocol. During the
discussion on Web Services you learned that SOAP acts as a packaging layer. SOAP
provides a set of rules for moving data.
Before the development of SOAP, there were many similar technologies like
Microsoft’s Distributed Component Object Model (DCOM), Java Remote Method
Invocation (RMI) etc. The difference between these technologies and SOAP is that SOAP
is outside the boundaries of development technologies and platforms. Other than these
technologies, there are certain XML based protocols similar to SOAP. A few such protocols
are listed below:
XMI (XML Metadata Interchange)
XML-RPC (XML Remote Procedure Calls)
WDDX (Web Distributed Data Exchange)
JABBER
The complete description of the above protocols is outside the scope of this text. But
one thing is for sure: SOAP includes the advantages of many of these technologies.
As the name indicates SOAP is “Simple”. By saying simple it doesn’t mean that it
lacks other features like security and reliability. To be precise SOAP is both simple and
powerful.
SOAP operates on top of standard internet protocols like HTTP, FTP, SMTP etc.
SOAP derives its interoperability from XML. Surely you can say SOAP is one of
the most powerful technologies in the XML family. The following diagram indicates the
position of SOAP.
[Figure: position of SOAP in relation to XML]
From the above figure you can understand that SOAP is used in combination with
the standard internet protocols like HTTP, FTP etc. The advantage that you get because of
this is that SOAP messages can pass through firewalls, since firewalls are normally configured
to allow these communications. So you can achieve this power of accessing services across
various networks with the help of SOAP.
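The packaging role of SOAP described in this section can be sketched as follows. The
sketch builds a SOAP 1.1 style envelope with Python's standard xml.etree.ElementTree
module. The payload element <GetPlayerAge> and the helper wrap_in_soap are hypothetical;
a real service would define its own message format, and sending the message over HTTP
is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)

def wrap_in_soap(payload):
    """Place an XML payload inside a SOAP Envelope/Body structure."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope

# Hypothetical application-level XML to be transferred.
request = ET.Element("GetPlayerAge")
ET.SubElement(request, "name").text = "Rahul"

message = ET.tostring(wrap_in_soap(request), encoding="unicode")
print(message)
```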
The services in a Service Oriented Architecture are loosely coupled in nature. Because
of this loosely coupled nature, parts of the application can be modified without worrying much
about integration issues. This becomes a very big advantage in the large enterprise application
development process.
Questions
Part A
c. Loose coupling
d. All of the above
Answers
1.a 2. b 3. c 4. a 5. d
Part B
6. Explain the discoverability feature of web services.
7. What are the advantages of SOA?
8. Explain the components of SOAP.
9. What are the characteristics of web services?
10. Explain the web services model in detail.
UNIT II
XML TECHNOLOGY
2.1 XML NAMESPACES & STRUCTURING
As has already been stated, XML provides lots of advantages, like the use of customized
tags. These customized tags lead to a potential problem called tag duplication.
In other words, if more than one developer is working with the same XML file and
they are using their own tags, there is a possibility of duplicated tags. To avoid this
duplication problem, namespaces are used.
Namespaces were not part of the original XML specification. They were added at
a later point. You can find the XML namespace specification at
http://www.w3.org/TR/REC-xml-names/.
Namespaces identify tags with specific groupings. For example, tags like <firstname>
can be used by more than one source. In this situation namespaces identify the source
to which the corresponding tag belongs.
<player>
<name> Rahul </name>
<age> 33 </age>
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
</player>
</team>
Let us assume that you would like to insert your opinions on the players in this XML file,
but you don’t want to disturb the original flow of the XML document. You can very well
do that by using namespaces. The source file with namespaces inserted into it is as given
below:
<player>
<name> Rahul </name>
<age> 33 </age>
<me:comment> Consistent Player </me:comment>
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
<me:comment> Committed Player </me:comment>
</player>
</team>
Figure : Namespace in XML
The output of the above XML listing is shown in the figure. The URL given in the namespace
declaration can be used to provide documentation regarding the namespace. Though a URL is
given there as the attribute value, it is not mandatory that it points to an actual page. But it
is a good practice to give a live URL there.
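A minimal sketch of how a program sees such prefixed tags is given below, using Python's
standard xml.etree.ElementTree module. The namespace URL http://example.com/my-comments
is a placeholder chosen for this illustration; it is not the one used in the figure.

```python
import xml.etree.ElementTree as ET

# The prefix "me" is bound to a placeholder namespace URL on the root element.
doc = """<team xmlns:me="http://example.com/my-comments">
  <player>
    <name>Rahul</name>
    <me:comment>Consistent Player</me:comment>
  </player>
</team>"""

root = ET.fromstring(doc)

# The prefix-to-URL mapping is passed to the search functions.
ns = {"me": "http://example.com/my-comments"}
comment = root.find("player/me:comment", ns)

# Internally the tag name is qualified with the namespace URL, not the prefix.
print(comment.tag)
print(comment.text)

# A plain <comment> tag, without the namespace, is a different tag altogether.
plain = root.find("player/comment")
```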
It is also possible to use namespaces at the child node level. The example given below
illustrates this fact. In this example the tag comment has been added only for the first
player. Note that the namespace me has been defined only in that place.
<player>
<name> Rahul </name>
<age> 33 </age>
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
</player>
</team>
The output for the above listing is shown in the following figure. The point that
we would like to emphasize here is that namespaces can be defined at any depth. It is not
mandatory to define the namespace at the root level.
When you are using multiple namespaces, it is also possible to make a particular
namespace the default one, i.e. it is not required to use that particular namespace explicitly.
If no namespace prefix is given, the default namespace is assigned.
<body>
<center>
<h1>
This is a simple html file
</h1>
</center>
</body>
</html>
Figure: Default namespace
To make a namespace the default, use the xmlns attribute without any prefix. Note that in the
above example the xmlns attribute used with the html tag does not have any prefix. So it is
considered the default namespace.
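The effect of a default namespace can be seen in the sketch below, again with Python's
standard xml.etree.ElementTree module. The XHTML namespace URL used here is an
assumption for illustration, since the xmlns value of the figure is not reproduced in this
text. Every unprefixed element inside <html> ends up in the default namespace.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"  # assumed value of the xmlns attribute

doc = f"""<html xmlns="{XHTML}">
  <body><h1>This is a simple html file</h1></body>
</html>"""

root = ET.fromstring(doc)

# Unprefixed tags inherit the default namespace declared on <html>.
print(root.tag)

# To search, bind any prefix of your choice to the same URL.
heading = root.find("x:body/x:h1", {"x": XHTML})
print(heading.text)
```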
Namespaces are not compulsory with all XML files. But there are certain situations
where namespaces become an effective tool:
When the particular XML file has the possibility of getting linked with other XML
files in future. In such a scenario, if those XML files also use tags with the same name,
namespaces become mandatory.
Namespaces are widely used with XML technologies like SOAP and WSDL.
XML allows you to create your own tags and structure. This freedom is very much
the reason for the success of XML. At the same time, processing an XML file with various
applications requires it to follow certain rules regarding its structure. This structuring of
XML files becomes very important when they are shared across a multitude of parties.
The structuring of XML can be done through various techniques as given below:
Document Type Definition (DTD)
XML Schema
Both of the above given techniques allow you to validate an XML file. Validating
means checking whether the XML file follows the rules given in its structure definition. Look
at the following XML listing.
<?xml version="1.0" encoding="UTF-8"?>
<player>
<name> Dhoni </name>
<age> 26 </age>
</player>
Suppose your processing applications require that the input XML file strictly
adhere to this structure. Then the following XML file would become invalid.
In the above example a new tag <category> has been added, which is not there in the
version given previously. This additional tag can cause problems in the processing
applications. To avoid these types of problems, the validation of XML becomes an
important task.
This section would focus on Document Type Definitions (DTD). The next section
would focus on schemas.
As stated above, XML validation can be done in more than one method. Document
Type Definition is the earliest method of validating XML file. Indeed this method has been
derived directly from the ancestor of XML i.e. SGML (Standard Generalized Markup
Language).
Developers who have an understanding of SGML can easily cope with
Document Type Definitions. There exist two ways in which Document Type Definitions can
be used in an XML file. They are given in the following figure.
These classifications have been made based on the location where the Document
Type Definition is located. Whether it is external or internal DTD holds certain rules that
the attached XML file has to follow.
In the case of an internal DTD, the definitions are located in the same file itself. The rules
that the XML file has to follow are given in the XML file itself. Generally they are given
immediately following the <?xml ... ?> declaration. This type of declaration is not used as
widely as external document type definitions.
For example for the file given in the beginning of this section, the rules can be given in
the same file itself as shown in figure
<player>
<name> Dhoni </name>
<age> 26 </age>
<category> Wicket Keeper </category>
</player>
The above code when viewed in internet explorer gives the following output.
When the document type definition is given as a separate file and it is linked with the
xml file then it is called external DTD. Generally the definition files are stored with the
extension .dtd. The following example lists an XML file with external DTD.
<player>
<name> Dhoni </name>
<age> 26 </age>
<category> Wicket Keeper </category>
</player>
Here the document type rules are given in the file named “playerrule.dtd”. The content
of this file is as shown below:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT player (name, age, category) >
Internal and external DTDs can be combined together in a single XML file. An example
of this type is given in the figure below.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE player SYSTEM "playerrule.dtd" [
<!ELEMENT zone (#PCDATA)>
]>
<player>
<name> Dhoni </name>
<age> 26 </age>
<category> Wicket Keeper </category>
<zone> South </zone>
</player>
In the above example you can notice that the external declarations are given in the file
“playerrule.dtd” and the definition is extended with an additional element zone, for which
the definition is given internally.
Declaring Elements
Note that here the keyword ELEMENT should be given in upper case, similar to
DOCTYPE. Otherwise it would throw an error.
Here the keyword PCDATA indicates that this element can hold parsed character
data. The keyword PCDATA should be preceded with the character “#”. An element
declared as PCDATA can hold any character data at the same time it can’t hold child
elements. This is illustrated in the following Figure.
In the above example, name has been defined as #PCDATA. At the same time the
name element has been given two child elements, which is invalid because of the
PCDATA type.
At the same time empty elements are allowed with the PCDATA. For example the
following is valid:
<name></name>
When the content specification is given as ANY, the element can hold any content. The
difference between PCDATA and ANY is that the latter accepts child elements. This has been
illustrated in the following example.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE player[
<!ELEMENT name ANY>
<!ELEMENT age (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT player (name, age, category)>
]>
<player>
<name>
<first_name> M S </first_name>
<last_name> Dhoni </last_name>
</name>
<age> 26 </age>
<category> Wicket Keeper </category>
</player>
In the above example the content specification for “name” is given as “ANY”, which
means that it can accept child elements as well. “ANY” should be used
carefully. Otherwise it can lead to overly liberal rules, which would in turn make the
usage of the DTD itself less effective.
Here the player element holds three child elements, namely name, age and category. The
order of the elements is also important.
Various Quantifiers
There exist various quantifiers which can be given in combination with elements. For
example, when the quantifier “+” is given it means that the particular element can appear
one or more times. An example is given below:
In the above example “category” is given with the “+” quantifier, meaning that it must
appear at least once and can be repeated any number of times.
The “?” quantifier is used in those situations where the number of instances is zero or
one at the maximum. If the number of instances is more than one then the document would
become invalid.
Another quantifier which can be used is “*”. The “*” quantifier allows the child element
to appear zero or more times. For example
When there are many choices, they can be given separated by the character “|”. For
example
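The quantifiers behave much like the operators of regular expressions. The sketch below
uses that similarity to imitate one content model in Python. It is not a DTD validator
(Python's standard library does not validate against DTDs); the content model
(name, age?, category+) and the helper children_match are assumptions made for this
illustration.

```python
import re
import xml.etree.ElementTree as ET

def children_match(element, pattern):
    """Check the sequence of child tag names against a regex pattern."""
    sequence = "".join(f"<{child.tag}>" for child in element)
    return re.fullmatch(pattern, sequence) is not None

# Regex stand-in for the content model (name, age?, category+):
# exactly one name, at most one age, one or more category elements, in order.
PLAYER_MODEL = r"<name>(<age>)?(<category>)+"

ok = ET.fromstring("<player><name/><category/><category/></player>")
no_category = ET.fromstring("<player><name/><age/></player>")
two_ages = ET.fromstring("<player><name/><age/><age/><category/></player>")

print(children_match(ok, PLAYER_MODEL))
print(children_match(no_category, PLAYER_MODEL))
print(children_match(two_ages, PLAYER_MODEL))
```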
Attribute Declarations
One of the important components of any XML file is “attributes”. DTD can be used to
structure the attributes as well. To accomplish this <!ATTLIST> is used. The general syntax
of ATTLIST is as given below:
Here the data type used for attributes is CDATA (character data). Various other types
also exist, like ID (requires a unique identifier as the value of the attribute), IDREF (refers
to the ID value of another element), ENTITY (allows for an entity to be provided) etc.
Apart from these, the values of the attributes can be controlled using the following keywords.
Now if the last_name attribute is missing then the document would become invalid, whereas
first_name is not mandatory.
#IMPLIED: If the attribute is specified as #IMPLIED then the value is not
mandatory. By default it would be considered as a “null” value. For example
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT name EMPTY>
<!ATTLIST name first_name CDATA #IMPLIED>
<!ATTLIST name last_name CDATA #REQUIRED>
#FIXED: Sometimes the values of attributes do not change. For those kinds of
attributes #FIXED can be given. For example
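The behaviour of #FIXED can be observed with the sketch below. It assumes Python's
standard xml.etree.ElementTree module, whose underlying expat parser reads the internal
DTD and supplies declared default values but does not validate the document. The
attribute names name and country are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Internal DTD with one #REQUIRED attribute and one #FIXED attribute.
doc = """<!DOCTYPE player [
  <!ELEMENT player EMPTY>
  <!ATTLIST player name CDATA #REQUIRED>
  <!ATTLIST player country CDATA #FIXED "India">
]>
<player name="Dhoni"/>"""

player = ET.fromstring(doc)

# "country" is not written on the element, yet the parser supplies the
# fixed value declared in the DTD.
print(player.attrib)
```

Because the parser is non-validating, it would not complain if the #REQUIRED attribute
were missing; checking that rule needs a validating parser.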
An XML entity replaces a symbol with a character string. By default XML provides various
entities like &amp; for &, &lt; for < etc. Apart from these you can define your own entities.
For example
The output of the above XML listing is as shown below in the figure. In the figure you can
see that the entities are replaced with the actual string values.
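The replacement of entities can also be observed programmatically. In the sketch below,
which assumes Python's standard xml.etree.ElementTree module, the entity name country
and the element names are made up for illustration. Both the user-defined entity and the
predefined &amp; entity are replaced by the parser.

```python
import xml.etree.ElementTree as ET

# A user-defined entity is declared in the internal DTD and used in the body.
doc = """<!DOCTYPE player [
  <!ENTITY country "India">
]>
<player>
  <name>Dhoni</name>
  <team>&country;</team>
  <motto>Fast &amp; steady</motto>
</player>"""

root = ET.fromstring(doc)

print(root.find("team").text)   # value of the user-defined entity
print(root.find("motto").text)  # predefined entity replaced by the symbol
```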
As described in the previous section, DTDs are used to validate the structure of an
XML listing. Apart from DTD there is another technique to do this, called XML schema.
Before discussing XML schema let us first look into the drawbacks of the DTD method.
DTD doesn’t allow specifying the exact number of times an element has to appear.
Though there are quantifiers like +, * etc., it doesn’t have provisions to exactly specify
the frequency of occurrences.
Another disadvantage of DTD is that it doesn’t have provisions to reuse a set of
elements already defined.
All these drawbacks lead to the usage of XML schema instead of DTD. XML schema
provides much finer control over the document.
2.1.3.2 XML Schema Introduction
W3C accepted XML schema as a recommendation in 2001 (a second edition followed in
2004). The biggest advantage of XML schema is that it follows the XML syntax. Indeed the
schema document itself is an XML file, normally saved with the “.xsd” extension.
XML schema document has an XML declaration. The simple form of this XML
declaration is as shown below:
<?xml version="1.0" encoding="UTF-8"?>
Each XML schema document has a root element. The root element is <xs:schema>.
The simple form of this root element is as follows:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- The actual definitions -->
</xs:schema>
Apart from this, <xs:schema> can have other attributes as listed below:

Attribute               Usage
attributeFormDefault    Indicates whether the “attributes” of the instance document
                        need to be prefixed. The possible values are “qualified” and
                        “unqualified”. If the value is qualified then the attribute’s
                        prefix becomes necessary; otherwise it is not necessary.
elementFormDefault      Indicates whether the “elements” of the instance document
                        need to be prefixed. The possible values are “qualified” and
                        “unqualified”. If the value is qualified then the element’s
                        prefix becomes necessary; otherwise it is not necessary.
version                 Used to indicate the version of the schema document.
xml:lang                Indicates the language used in the XML schema document.
XML Schema provides much finer control over the type of data that is valid for a
particular element. It is depicted in the following example.
<Runs xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
</xs:schema>
In the above XML schema document you can notice the <xs:element> with type
attribute. The “type” attribute indicates which type of value is valid for the particular element.
This type of data restriction is not possible with DTD.
A sample list of data types is given in the following table.

Data Type    Description
boolean      Used to indicate binary-valued (true/false) elements
date         A date value
decimal      A decimal number (positive or negative)
double       A floating point number with double precision
float        A floating point number with single precision
string       Text value
time         A time instance
Generally XML schema documents are verbose in nature. To ease the work of the
developer, there are many tools. Examples of such tools are XMLSpy, Microsoft Visual
Studio etc. These tools make the developer’s task simple by providing various options
through graphical editors and Integrated Development Environments.
XML Schema documents play a vital role in the development of Web Services, RSS
Feeds etc.
Questions
Part A
Answers
1. a 2. a 3. b
4. c 5. b
Part B
Short Questions
PART C
Descriptive type Questions
11. Explain the XML schema usage with a clear case study.
12. Create a DTD specification which would specify rules for maintaining student
information.
2.2 XML PRESENTATION
Throughout this text, it has been repeatedly mentioned that XML is a data
representation language. If you view XML in a browser like Internet Explorer or Firefox,
it is simply displayed as a tree structure which can be folded or unfolded at elements.
At the same time, XML can be presented in a more polished manner by using techniques
like Cascading Style Sheets (CSS). This section elaborates how to render an XML
file with style specifications in CSS. The following steps are involved in this process.
Creating the Source XML file
Creating the CSS file
Linking XML and CSS
CSS Definition
In the above CSS definition you can notice that style has been defined for each type
of element specified in the XML file.
2.2.2 Linking the CSS and XML
The most important step in this process is the linking of CSS and XML. The CSS
Linked XML file is as shown in the figure.
XML output
2.2.4 Using Class specific Selector
The selectors can be specified for a particular class. They can be used in the XML
file with “class” attribute.
The example for this is as shown in the following figure.
Apart from this, you can also use “in-line style”. In this case the style information is
given directly in the XML file itself. This method is best avoided because it creates
maintenance problems for the styles. If the style has to be modified then it requires
changes in too many locations.
In the previous section, presentation of XML with the help of CSS was explained.
This section focuses on a more advanced tool called XSLT. XSLT stands for eXtensible
Stylesheet Language Transformations.
XSLT can be defined as a transformation language for XML. The output of XSLT
transformation can be HTML or text or even XML itself.
To perform XML transformation you need at least two files. They are
The source XML file
XSLT style sheet
Consider the following XML file and XSLT style sheet.
XSLT File
Linking of XML and XSLT can be done by adding a line in the source XML file as
shown in the following figure.
In the example illustrated above, two matches have been given: one for <team> and
another for <player>. If you look at the output file, you can find that the corresponding
XML file has been transformed with the style given in the XSLT.
xsl:value-of
The <xsl:value-of> element has an attribute called “select” which indicates the value to be
selected for display. Here this attribute has the value “name”, so the “name” appears
in the output. If you change the value to “age”, the output would look as shown in the
figure:
xsl:for-each
With <xsl:value-of>, the “select” attribute yields only the first match. If there is more than
one match then xsl:for-each can be effectively used. xsl:for-each gathers all the matches.
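The contrast between taking only the first match and iterating over all matches can be
illustrated by analogy in Python. The sketch below is not XSLT: it uses the find and
findall functions of the standard xml.etree.ElementTree module, which behave in a
comparable way (first match versus every match).

```python
import xml.etree.ElementTree as ET

team = ET.fromstring(
    "<team>"
    "<player><name>Rahul</name></player>"
    "<player><name>Sachin</name></player>"
    "</team>"
)

# Comparable to xsl:value-of: only the first matching node is used.
first_only = team.find("player/name").text

# Comparable to xsl:for-each: every matching node is visited.
all_names = [name.text for name in team.findall("player/name")]

print(first_only)
print(all_names)
```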
Apart from performing the client side transformation in the web browser, XSLT
transformation can also be performed using other methods listed below:
Server Side Transformation: Using a server side scripting language like JSP,
the XSLT transformation can be performed. A sample JSP program is as shown in
the figure.
Standalone Programs: You can also write standalone programs in languages like
Java to perform XSLT transformation. A sample Java program is shown in the
figure.
The above examples are written using Java. But the implementation is not restricted
only to Java. You can use other languages also. This is one of the important advantages of
XML. XML is supported in almost all of the popular languages.
This section highlights one of the important techniques of XML, called the XML
parser.
XML Parsing
Another important aspect of XML is the ability to parse the XML document using
programming languages like Java, .NET etc. XML parsing is the process of breaking an
XML document into components so that they can be handled programmatically.
There are two types of parsers available for XML. They are as shown in the following
figure.
DOM
There are various levels of DOMs recommended by W3C. They are DOM Level1,
DOM Level2 etc.
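A DOM parser loads the whole document into memory as a tree of nodes, which the program
can then navigate and modify. A minimal sketch, assuming the minidom implementation in
Python's standard xml.dom package, is given below.

```python
from xml.dom import minidom

# The complete document is parsed into an in-memory tree.
doc = minidom.parseString(
    "<team><player><name>Rahul</name><age>33</age></player></team>"
)
root = doc.documentElement

# Navigate the tree: the text of an element is itself a child node.
name_node = doc.getElementsByTagName("name")[0]
name = name_node.firstChild.data

# The tree can also be modified and written back as XML.
age_node = doc.getElementsByTagName("age")[0]
age_node.firstChild.data = "34"
updated = root.toxml()

print(name)
print(updated)
```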
SAX Parser
SAX refers to Simple API for XML. A SAX parser breaks the XML document into
a set of events. Examples of events are as listed below:
StartDocument,
StartElement,
EndElement,
SAXWarning,
SAXError,
EndDocument.
SAX is not an official W3C recommendation. Some of the useful concepts of SAX are
incorporated into later versions of DOM. The SAX parser is much faster than the DOM
parser in handling larger documents. At the same time DOM is very efficient in handling
smaller XML documents.
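The event-based nature of SAX can be seen in the following sketch, which assumes the
xml.sax module of Python's standard library. The class name EventLogger is illustrative.
The parser calls the handler methods as it reads the document, and no tree is built.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Record the events reported by the SAX parser, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append(f"startElement:{name}")

    def endElement(self, name):
        self.events.append(f"endElement:{name}")

    def endDocument(self):
        self.events.append("endDocument")

handler = EventLogger()
xml.sax.parseString(b"<team><player/></team>", handler)

for event in handler.events:
    print(event)
```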
Types of Parsers
XML parsers are classified in to two types. They are as shown in the following figure.
The XPath data model represents an XML document as a tree of nodes. The root node of the
XPath data model contains the entire XML document. This includes the comments and
processing instructions that occur before the root element or after it.
The XPath data model doesn’t include Document Type Definition. So the parts of
DTD are not accessible through XPath.
Location Path
A “location path” identifies a set of nodes in the XML document. The set returned by a
location path may consist of a single node or a collection of nodes, and it may even be
empty.
A location path consists of consecutive “location steps”. Each location step is evaluated
relative to a particular node called the “context node”.
The simplest location path is the one that identifies the root node of the XML
document. The root location path is written simply as “/”. (Note the similarity between
the path syntax of XPath and UNIX path syntax: in UNIX, too, the root directory is
represented by “/”.) An example is shown below:
<xsl:template match="/">
<html><xsl:apply-templates/></html>
</xsl:template>
Other elements can also be addressed with a location path. A sample document containing
inner nodes is shown below:
<team>
<player>
<name> Rahul </name>
<age> 33 </age>
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
</player>
</team>
XPath is often used in combination with XSLT. The XSLT document given below extracts
all the player names from the sample XML file given above.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="player">
<P>
<xsl:value-of select="name"/>
</P>
</xsl:template>
</xsl:stylesheet>
Comments in the XML document can also be matched, using the syntax given below.
<xsl:template match="comment()">
</xsl:template>
Figure : Comment Handling with XPath
Similarly, compound location paths can be specified by combining node names
with “/”, for example “/team/player/age”.
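A compound location path can be tried out with Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The sketch below is an illustration under that assumption: because ElementTree evaluates paths relative to an element, the path /team/player/age is written as ./player/age starting from the team element.

```python
import xml.etree.ElementTree as ET

TEAM_XML = """
<team>
  <player><name> Rahul </name><age> 33 </age></player>
  <player><name> Sachin </name><age> 35 </age></player>
</team>
"""

root = ET.fromstring(TEAM_XML)   # root is the <team> element

# Equivalent of /team/player/age, written relative to <team>.
ages = [int(age.text) for age in root.findall("./player/age")]

# Equivalent of /team/player/name; surrounding whitespace is trimmed.
names = [name.text.strip() for name in root.findall("./player/name")]

print(ages)
print(names)
```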
Apart from this, predicates can be used to filter out the nodes that match a particular
condition.
The example below filters the nodes of an XML document against such a condition. When
the value of the node “specialization” is “Captain”, the entry is shown in bold letters;
otherwise normal formatting is applied.
<team>
<player>
<name>Dhoni</name>
<age> 26 </age>
<specialization>Captain</specialization>
</player>
<player>
<name> Rahul </name>
<age> 33 </age>
<specialization>Defensive Batsman</specialization>
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
<specialization>Opening Batsman</specialization>
</player>
</team>
Figure : Source file used for filtering
</xsl:stylesheet>
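The predicate filtering described above can also be sketched outside XSLT. The Python fragment below uses the predicate form [specialization='Captain'], which is part of the XPath subset supported by the standard xml.etree.ElementTree module. The bold markup produced in the rendered list is an assumption standing in for the formatting that the stylesheet would apply.

```python
import xml.etree.ElementTree as ET

TEAM_XML = """<team>
  <player><name>Dhoni</name><age> 26 </age>
    <specialization>Captain</specialization></player>
  <player><name> Rahul </name><age> 33 </age>
    <specialization>Defensive Batsman</specialization></player>
  <player><name> Sachin </name><age> 35 </age>
    <specialization>Opening Batsman</specialization></player>
</team>"""

root = ET.fromstring(TEAM_XML)

# The predicate keeps only the <player> nodes whose <specialization> is Captain.
captains = root.findall("./player[specialization='Captain']")
captain_names = [player.findtext("name").strip() for player in captains]

# Mimic the described formatting: bold for the captain, plain text otherwise.
rendered = []
for player in root.findall("./player"):
    name = player.findtext("name").strip()
    rendered.append("<b>" + name + "</b>" if player in captains else name)

print(captain_names)
print(rendered)
```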
Apart from location paths, there are other types of XPath expressions that return
numbers or strings as output. The arithmetic operators supported by XPath are
listed below:
+
-
*
mod
div
2.4 XLINKS
XLink is used to provide links in an XML document. The XLink target is not restricted
to XML documents; the link target can be any other kind of document as well.
<team>
<player>
<name xmlns:xlink="http://www.w3.org/1999/xlink"
xlink:type="simple"
xlink:href="http://www.teamindia.com/cap.htm">
Dhoni</name>
<age> 26 </age>
<specialization>Captain</specialization>
</player>
<player>
<name> Sachin </name>
<age> 35 </age>
<specialization>Opening Batsman</specialization>
</player>
</team>
An XLink has various attributes, such as type and href in the above example. Some of
them are optional and others are mandatory.
There is another attribute called “show”. The possible values for this attribute are
listed below:
new
replace
embed
other
none
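As an illustrative sketch, the XLink attributes of an element can be read with Python's standard xml.etree.ElementTree module by qualifying each attribute name with the XLink namespace. The xlink:show attribute in the sample document is an assumption added for the example; it does not appear in the listing above.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Sample document; xlink:show="new" is a hypothetical addition.
DOC = """<team>
  <player>
    <name xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:type="simple"
          xlink:href="http://www.teamindia.com/cap.htm"
          xlink:show="new">Dhoni</name>
  </player>
</team>"""

root = ET.fromstring(DOC)

links = []
for element in root.iter():
    # Namespaced attributes are addressed as {namespace-uri}local-name.
    href = element.get("{%s}href" % XLINK_NS)
    if href is not None:
        links.append({
            "text": element.text.strip(),
            "type": element.get("{%s}type" % XLINK_NS),
            "href": href,
            "show": element.get("{%s}show" % XLINK_NS),
        })

print(links)
```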
XPointers
XPointers are used to locate specific portions of an XML document. XPointers
are used in combination with XLink, which makes the connection between two
documents more precise and clear. An XPointer specifies the link to a specific portion
using the location path syntax described earlier.
The above example would point to the name element specified by the XPointer, here the
one whose position equals 1.
2.5 XQUERY
XQuery is more powerful than XPath, XLink and XPointer. Indeed, XQuery provides
many of the constructs of a programming language.
XQuery treats an XML document much like a relational database, and its syntax has
features similar to SQL. XQuery is described by the W3C as given below:
A query language that uses the structure of XML intelligently can express queries
across all these kinds of data, whether physically stored in XML or viewed as XML via
middleware.
XQuery provides sets of operators and functions that facilitate the extraction
of data matching a particular condition. Various XQuery processors are available.
For example, “Galax” is one of the popular XQuery processors.
Questions
Part A
c. class
d. none of the above
3. SAX stands for
a. Simple API for XML
b. Structured API for XML
c. Select Argument extension
d. None of the above
4. Which of the following is an XML parser?
a. XCode
b. Xerces
c. M-XML
d. None of the above
5. Linking of XML and XSLT is done by
a. <?xml-stylesheet>
b. <?xml-style>
c. <?xml-source>
d. none of the above
Answers
1.a 2. b 3. a
4. b 5. a
Part B
Short Questions
Part C
UNIT III
SOAP
(Simple Object Access Protocol)
Objectives
Providing an overview of SOAP.
Introducing XML-RPC, its format.
Explaining the Anatomy of SOAP.
Introducing various actors in SOAP.
Introducing the Faults handling techniques in SOAP.
Explaining about the attachments with respect to SOAP.
3.1 INTRODUCTION
As has been consistently mentioned in this text, XML is a platform-neutral language.
This platform neutrality can be used effectively for communication between
applications running on different platforms.
SOAP is an important term in the XML domain. It plays a critical role in message
communication between applications running on different platforms. An understanding of
SOAP is crucial to becoming a web service developer. The power of SOAP is that it
is based entirely on XML.
SOAP is supported by the W3C and by many industry vendors such as Sun Microsystems,
IBM and HP. SOAP can easily be used with technologies like J2EE and Microsoft .NET.
This chapter introduces the fundamentals of the Simple Object Access Protocol (SOAP)
and its applications.
The Simple Object Access Protocol (SOAP) can use any of several networking protocols
for communication; it does not mandate a single protocol. Initially, the SOAP 1.0
specification indicated HTTP as the transport protocol. Later specifications added
support for most of the widely used protocols (as listed below):
There are various advantages to using the HTTP protocol (apart from its being the
simplest broadly supported protocol).
Support in Languages: HTTP has wide support across languages such as
Java, C and PHP. Libraries that support HTTP exist in all of these languages.
This is a crucial factor in making application development easier
and faster: because of this support, the programmer is not required to build
the code from scratch.
Support across platforms: HTTP is supported on all major platforms, such as
Windows, Solaris and Linux. So the distributed nature of an application is
not restricted to a single platform, and applications running on these
platforms can communicate without issues.
Firewall Pass-through: HTTP connections can pass through most
firewalls. This is a major advantage when choosing HTTP as a transport protocol.
Text Based: The HTTP protocol is text based. So, for testing, telnet can be
used to check the servers.
Usage of HTTP header: The headers associated with HTTP can be used
effectively to convey information such as the document encoding. This is shown in
the following figure 3.1.
Technologies such as CORBA and RMI, which were developed prior to XML-RPC,
had some issues, as listed below:
Complexity: These techniques were more complex compared with XML-RPC.
Binary Nature: These techniques were binary in nature, so there were problems
while passing through firewalls.
Platform Dependency: Another issue with these techniques is their
dependency on a particular platform, so it was not easy to establish
communication between applications running on different platforms.
3.3.2 What XML-RPC is?
As stated earlier, XML-RPC is a remote procedure call technique. XML-RPC
has successfully addressed the above-mentioned problems of complexity,
binary nature and platform dependency.
The core concept of XML-RPC is as listed below:
1. An XML document is created which contains a procedure name and arguments
(parameters).
2. This document is sent to the web server using a transport protocol like HTTP.
3. The web server identifies the procedure name and arguments given in the
source document and invokes that procedure on the server side.
4. The result of the procedure is constructed as an XML document and sent
back to the client from which the original request came.
These simple steps are all that is needed to make an XML-RPC call.
An example of an XML-RPC document is shown in the following figure 3.2.
<?xml version="1.0"?>
<methodCall>
<methodName>getScore</methodName>
<params>
<param>
<value><string>India</string></value>
</param>
</params>
</methodCall>
A closer look at the above XML code snippet reveals the following
facts.
The root element here is <methodCall>.
The <methodCall> element has two child elements. They are
<methodName>: Indicates the name of the method to be called.
<params>: Indicates the arguments associated with this method. The <params>
element can have several child elements called <param>, each indicating one
argument. The value of each argument is supplied through the <value> tag,
together with its data type. In the above example the data type is
specified as <string>. There are other data types, such as
int
double
boolean
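A request document of this shape does not have to be written by hand. As a hedged sketch, Python's standard xmlrpc.client module can generate it and read it back. The method name getScore and the argument India come from the example above, while setValues and its arguments are hypothetical and only show how other data types are encoded.

```python
import xmlrpc.client

# Build the <methodCall> document for getScore("India").
request_xml = xmlrpc.client.dumps(("India",), methodname="getScore")
print(request_xml)

# Reading the document back yields the arguments and the method name.
params, method = xmlrpc.client.loads(request_xml)

# Other data types are tagged according to the Python value passed in.
# setValues is a hypothetical method name used only for this illustration.
typed_xml = xmlrpc.client.dumps((42, 3.5, True), methodname="setValues")
print(typed_xml)
```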
These requests are sent with an HTTP header. This HTTP header carries the
following data.
The complete XML-RPC request for the above example may look like the following.
POST /xmlrpc HTTP/1.0
User-Agent: testXMLRPCClient/1.0
Host: 172.16.12.66
Content-Type: text/xml
Content-Length: 168

<?xml version="1.0"?>
<methodCall>
<methodName>getScore</methodName>
<params>
<param>
<value><string>India</string></value>
</param>
</params>
</methodCall>
The response to the above XML-RPC request is generated by the server.
An example is provided below in figure 3.5.
<?xml version="1.0"?>
<methodResponse>
<params>
<param>
<value><string> 345 For 7 </string></value>
</param>
</params>
</methodResponse>
A closer look at the response reveals the following facts:
The XML-RPC response is very similar to the XML-RPC request.
The methodCall element is replaced by the methodResponse element.
The XML-RPC response contains only one parameter.
Similar to the XML-RPC request, the XML-RPC response also has associated
HTTP header information. An example is shown in figure 3.6. (This example shows
both the XML-RPC response message and the header.)
HTTP 1.0 is supported, and the response is compatible with HTTP 1.1 as well.
The content type is indicated as text/xml.
The XML-RPC response uses “200 OK” as the response code.
The length of the response is also indicated in the response header, so that
the receiver knows how many bytes to read.
<?xml version="1.0"?>
<methodResponse>
<params>
<param>
<value><string> 345 For 7 </string></value>
</param>
</params>
</methodResponse>
Figure 3.6: XML-RPC with Header Information
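On the client side, the single parameter of such a response can be extracted as sketched below with Python's standard xmlrpc.client module. The variable names are assumptions; the response document is the one shown in the figure.

```python
import xmlrpc.client

RESPONSE_XML = """<?xml version="1.0"?>
<methodResponse>
<params>
<param>
<value><string> 345 For 7 </string></value>
</param>
</params>
</methodResponse>"""

# loads() returns the parameters as a tuple; a response carries no method name.
params, method = xmlrpc.client.loads(RESPONSE_XML)

# A response holds exactly one parameter, so the result is params[0].
score = params[0].strip()
print(score)
```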
When the execution of the method specified in the XML-RPC request fails, an
XML-RPC fault occurs. An XML-RPC fault response is very similar to a normal
XML-RPC response, except that it has a <fault> tag instead of the <params>
tag.
Like the <params> element, the <fault> element can contain at most one value.
The XML-RPC fault response may contain an error code. An XML-RPC fault
response is shown in figure 3.7.
<?xml version="1.0"?>
<methodResponse>
<fault>
<value><string>No such method!</string></value>
</fault>
</methodResponse>
Figure 3.6: XML-RPC Fault
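In the XML-RPC specification the single value inside <fault> is a struct with two members, faultCode and faultString, which is how the error code mentioned above is carried. The sketch below uses Python's standard xmlrpc.client module to produce such a fault and to show how a client sees it. The fault code 1 is a hypothetical value chosen for the example.

```python
import xmlrpc.client

# Server side: turn an error into a fault response document.
fault_xml = xmlrpc.client.dumps(
    xmlrpc.client.Fault(1, "No such method!"),   # 1 is a made-up error code
    methodresponse=True,
)
print(fault_xml)

# Client side: loads() raises Fault when the response contains <fault>.
try:
    xmlrpc.client.loads(fault_xml)
    outcome = "no fault"
except xmlrpc.client.Fault as fault:
    outcome = "fault %d: %s" % (fault.faultCode, fault.faultString)

print(outcome)
```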
XML-RPC data structures
XML-RPC supports data structures such as arrays and structs. It does not
support pointers.
Arrays:
To represent an array, the “array” element is used. The array element has exactly
one “data” element, which in turn can hold zero or more value elements.
Each value element contains the corresponding data type element and the actual parameter
value.
<array>
<data>
<value><string>Dhoni</string></value>
<value><string>Rahul</string></value>
<value><string>Sachin</string></value>
</data>
</array>
Figure 3.7 - XML RPC Array
Unlike arrays in many programming languages, an XML-RPC array can contain values of
different data types. An example is shown below:
<array>
<data>
<value><string>Dhoni</string></value>
<value><string>Rahul</string></value>
<value><int>125</int></value>
</data>
</array>
Figure 3.8 : XML RPC array with values from different data types
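As a small sketch using Python's standard xmlrpc.client module, a list with mixed element types is encoded as a single array element and comes back unchanged. The variable names are assumptions, and the values mirror the figure above.

```python
import xmlrpc.client

# A Python list maps onto <array><data>...</data></array>.
mixed = ["Dhoni", "Rahul", 125]
request_xml = xmlrpc.client.dumps((mixed,), methodname="getScore")
print(request_xml)

# Decoding restores the list, with each element keeping its own type.
params, method = xmlrpc.client.loads(request_xml)
decoded = params[0]
print(decoded)
```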
An XML-RPC request that uses an array is shown below in figure 3.9.
<?xml version="1.0"?>
<methodCall>
<methodName>getScore</methodName>
<params>
<param>
<value>
<array>
<data>
<value><string>Dhoni</string></value>
<value><string>Rahul</string></value>
<value><string>Sachin</string></value>
</data>
</array>
</value>
</param>
</params>
</methodCall>
The XML-RPC response to the above request would look as shown in figure
3.10.
<struct>
<member>
<name>sachin</name>
<value><int>145</int></value>
</member>
<member>
<name>Dhoni</name>
<value><int>158</int></value>
</member>
</struct>
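A struct is a set of named members, so it corresponds naturally to a dictionary. The following sketch, again using Python's standard xmlrpc.client module, wraps the scores shown above in a complete response document and decodes it. The variable names are assumptions.

```python
import xmlrpc.client

# A dictionary maps onto <struct> with one <member> per key.
scores = {"sachin": 145, "Dhoni": 158}
response_xml = xmlrpc.client.dumps((scores,), methodresponse=True)
print(response_xml)

# The response carries one parameter, which decodes back to a dictionary.
params, _ = xmlrpc.client.loads(response_xml)
decoded = params[0]
print(decoded["Dhoni"])
```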
No standard validation technique is defined for XML-RPC. However, an XML-RPC
document can be validated with either of the techniques given below.
DTD method
XML Schema method
Though either of these techniques can be used to validate an XML-RPC document,
the XML Schema method has certain advantages, as given in the list below.
In XML-RPC, only methodCall and methodResponse are legal root
elements. An XML Schema can clearly specify this restriction, whereas a DTD
cannot.
The permitted values of the data types (their ranges) can easily be specified with a schema.
Method names and strings should contain only ASCII characters. This restriction
can also be clearly imposed with a schema.
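The three restrictions listed above can be illustrated with a few hand-written checks. The Python sketch below is not a DTD or XML Schema validator; it is an assumption-laden stand-in that only shows what each rule means. The function name check_xmlrpc is hypothetical, and the 32-bit range is the one XML-RPC defines for its int type.

```python
import xml.etree.ElementTree as ET

LEGAL_ROOTS = {"methodCall", "methodResponse"}


def check_xmlrpc(document):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    root = ET.fromstring(document)

    # Rule 1: only methodCall and methodResponse are legal root elements.
    if root.tag not in LEGAL_ROOTS:
        problems.append("illegal root element: " + root.tag)

    # Rule 2: method names must contain only ASCII characters.
    for name in root.iter("methodName"):
        if not (name.text or "").isascii():
            problems.append("method name is not ASCII: " + repr(name.text))

    # Rule 3: an <int> must fit in a signed 32-bit integer.
    for number in root.iter("int"):
        try:
            value = int(number.text)
        except (TypeError, ValueError):
            problems.append("not an integer: " + repr(number.text))
            continue
        if not -2**31 <= value <= 2**31 - 1:
            problems.append("int out of 32-bit range: " + str(value))
    return problems


print(check_xmlrpc("<methodCall><methodName>getScore</methodName></methodCall>"))
print(check_xmlrpc("<call/>"))
```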
As the name suggests, SOAP is a simple protocol. SOAP is based on XML. The
purpose of the Simple Object Access Protocol is to allow applications to exchange data
over HTTP.
SOAP plays a major role in the development of Web Services. In that view, SOAP
can be called a protocol for accessing web services. With the rise of web services,
SOAP has become the de facto standard for accessing applications over a network.
The list below explains the attributes of the Simple Object Access Protocol.
SOAP is a protocol primarily designed for communication between applications
over a network.
SOAP gives a format for sending messages over a network. Since this format has
been accepted by the majority of software vendors, it becomes very easy to establish
communication between applications.
The biggest advantage of SOAP is its platform independence. It enables seamless
integration of applications running across various platforms. For example, an
application running on the Microsoft Windows operating system can communicate
with another application running on some other operating system, such as Solaris or
Linux.
There are many misconceptions regarding SOAP. This section gives an idea
of what SOAP is not.
The important point to note about SOAP is that it is not a programming
language.
It is not a business application component that can be used directly in business
application development.
SOAP doesn’t provide any garbage collection feature.
SOAP doesn’t support object activation or objects by reference.
It doesn’t support message batching.
A SOAP message is basically an XML document with certain specific elements. The
following figure 3.12 depicts the various components of a SOAP message.
Of the components listed above, the Header and Attachments are optional. The Envelope
and Body are mandatory.
SOAP Envelope
The SOAP envelope is the primary container of a SOAP message. It is the root
element of the message.
As per the SOAP 1.1 specification, a SOAP message that does not have this envelope
as its container is considered invalid.
An encoding style can also be declared in the envelope. The encodingStyle attribute is
used to identify the data types used in the document. The encoding style can be
specified on any element; it applies to that element and its child elements.
By default, a SOAP message does not have any encoding.
The envelope carries the namespace indicated in figure 3.13.
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
...
Message information goes here
...
</soap:Envelope>
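The role of the envelope as the mandatory root element can be sketched with Python's standard xml.etree.ElementTree module. The namespace URI is the one used in the listing above; the helper name is_soap_message is a hypothetical choice for this illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"
ET.register_namespace("soap", SOAP_NS)   # serialize with the soap: prefix

# Envelope is the root; Header is optional, Body is mandatory.
envelope = ET.Element("{%s}Envelope" % SOAP_NS)
ET.SubElement(envelope, "{%s}Header" % SOAP_NS)
ET.SubElement(envelope, "{%s}Body" % SOAP_NS)

message = ET.tostring(envelope, encoding="unicode")
print(message)


def is_soap_message(text):
    """Accept a document only if its root element is the SOAP Envelope."""
    return ET.fromstring(text).tag == "{%s}Envelope" % SOAP_NS


print(is_soap_message(message))
print(is_soap_message("<note/>"))
```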
SOAP Header
As specified earlier, the SOAP header is an optional component. When present, it is
the first child element of the envelope.
The header can contain optional child elements.
If child elements are present, they must be qualified with a namespace.
An example of a SOAP header is provided in Figure 3.14.
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Header>
<m:Trans
xmlns:m="http://www.test.com/transaction/"
soap:mustUnderstand="1">888</m:Trans>
</soap:Header>
...
...
</soap:Envelope>
SOAP defines three attributes in the default namespace. The attributes are as listed
below:
actor
mustUnderstand
encodingStyle
The role of these attributes is to tell the recipient how it should process the
message.
Actor attribute
At times, the complete message may not be intended for a single endpoint; parts of it
may be meant for other endpoints along the message path. The actor attribute is used to
address a particular endpoint.
The general format for specifying the actor attribute is as shown below:
soap:actor="URI"
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Header>
<m:Trans xmlns:m="http://www.test.com/transaction/"
soap:actor="http://www.test.com/test/">888
</m:Trans>
</soap:Header>
...
...
...
...
</soap:Envelope>
Figure 3.15 : Actor attribute
mustUnderstand attribute
The next attribute in the SOAP header is the “mustUnderstand” attribute. The possible
values for this attribute are “0” and “1”. If mustUnderstand is set to “0”, the
recipient can carry on processing even if it doesn’t recognize the header element. On
the other hand, when mustUnderstand is set to “1”, the recipient has to recognize the
element; otherwise it must fail while processing the header. The general syntax for
mustUnderstand is as shown below:
soap:mustUnderstand="0|1"
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Header>
<m:Trans xmlns:m="http://www.test.com/transaction/"
soap:actor="http://www.test.com/test/"
soap:mustUnderstand="1">
888
</m:Trans>
</soap:Header>
...
...
...
...
</soap:Envelope>
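The decision a recipient makes for mustUnderstand can be sketched as below, using Python's standard xml.etree.ElementTree module. The function name check_headers, the set of understood header names and the returned strings are assumptions made for illustration. A real SOAP processor would generate a proper fault message rather than a string.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"

# Hypothetical list of header entries that this recipient knows how to process.
UNDERSTOOD = {"{http://www.test.com/transaction/}Trans"}

MESSAGE = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Header>
    <m:Trans xmlns:m="http://www.test.com/transaction/"
             soap:mustUnderstand="1">888</m:Trans>
  </soap:Header>
  <soap:Body/>
</soap:Envelope>"""


def check_headers(message, understood=UNDERSTOOD):
    """Return "ok", or "MustUnderstand fault" if a mandatory entry is unknown."""
    root = ET.fromstring(message)
    header = root.find("{%s}Header" % SOAP_NS)
    if header is None:               # the header is optional
        return "ok"
    for entry in header:
        must = entry.get("{%s}mustUnderstand" % SOAP_NS, "0")
        if must == "1" and entry.tag not in understood:
            return "MustUnderstand fault"
    return "ok"


print(check_headers(MESSAGE))
print(check_headers(MESSAGE, understood=set()))
```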
encodingStyle attribute
The encodingStyle attribute is used to define the encoding of the data types used in
the header element entries. As specified earlier, a SOAP message has no encoding by
default.
soap:encodingStyle="URI"
SOAP Body
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Body>
<m:GetRuns xmlns:m="http://www.onedaycricket.com/scores">
<m:Team>India</m:Team>
</m:GetRuns>
</soap:Body>
</soap:Envelope>
In the above example, notice the elements m:GetRuns and m:Team. These elements
are specific to a particular application, so they are not part of the
SOAP standard.
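A receiving application has to look inside the Body to find these application-specific elements. The sketch below shows one way of doing so with Python's standard xml.etree.ElementTree module. The variable names are assumptions, and the message is the GetRuns request with its namespaces as used in this unit.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"
APP_NS = "http://www.onedaycricket.com/scores"

REQUEST = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body>
    <m:GetRuns xmlns:m="http://www.onedaycricket.com/scores">
      <m:Team>India</m:Team>
    </m:GetRuns>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(REQUEST)
body = root.find("{%s}Body" % SOAP_NS)

# The first child of Body is the application-defined element.
call = body[0]
operation = call.tag.split("}")[1]          # strip the {namespace} part
team = call.findtext("{%s}Team" % APP_NS)

print(operation, team)
```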
The above example requests the score of the Indian team in a cricket match. The
response to this message may look like the one shown below in figure 3.18.
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Body>
<m:GetRunsResponse xmlns:m="http://www.onedaycricket.com/scores">
<m:Team> 350 For 7</m:Team>
</m:GetRunsResponse>
</soap:Body>
</soap:Envelope>
Figure 3.18 : SOAP Response message
SOAP Fault
The SOAP Fault element is used to report errors. It carries error and status
information.
The SOAP Fault element appears as a child element of the Body element.
The SOAP Fault element can appear only once in a SOAP message.
The SOAP Fault element can have the following sub-elements. They are as listed
below:
faultcode: The faultcode contains a standard value that identifies the error
(or the status information). The fault code values are as given
below:
VersionMismatch: Indicates that an invalid namespace is defined or that the version
is not supported.
MustUnderstand: A header element with the mustUnderstand value set to “1”
was not understood.
Client: This fault code indicates that the problem originates from the client,
for example a message that is incorrectly formed.
Server: This fault code indicates that the problem arose during processing
on the server side.
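As an illustrative sketch, a Fault can be located inside the Body and its fault code read with Python's standard xml.etree.ElementTree module. The sample message, the fault text and the function name read_fault are assumptions. The faultstring sub-element used here is the human-readable companion of faultcode in SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"

# Hypothetical fault message returned by a score service.
FAULT_MESSAGE = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Server</faultcode>
      <faultstring>Score service unavailable</faultstring>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""


def read_fault(message):
    """Return (fault code, fault string), or None if the Body holds no Fault."""
    root = ET.fromstring(message)
    fault = root.find("{%s}Body/{%s}Fault" % (SOAP_NS, SOAP_NS))
    if fault is None:
        return None
    code = fault.findtext("faultcode", default="")
    # Drop the namespace prefix, leaving e.g. "Server" or "Client".
    return code.split(":")[-1], fault.findtext("faultstring", default="")


print(read_fault(FAULT_MESSAGE))
```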
The fault code values are as shown in the following figure 3.19:
XML-RPC has been in use in the industry for a considerable period of time, so in some
respects it is more stable.
The greatest advantage of XML-RPC is its simplicity. XML-RPC is less complicated
to learn; it has a short learning curve.
Advantages of SOAP
The advantages of SOAP are as listed below:
XML-RPC supports arrays and structs, but they are unnamed. SOAP structs and
arrays, on the other hand, can be named.
Customization is the greatest advantage of SOAP. It lets the developer
create customized messages comfortably.
Support from industry leaders is another big advantage of SOAP.
For example, Microsoft has given importance to SOAP in its .NET framework.
Similarly, technologies like J2EE also support SOAP.
It supports developer-specified character sets.
It supports developer-defined data types.
It supports message-specific processing instructions.
Disadvantages of SOAP
The documentation support associated with SOAP is limited.
Considering all the factors, SOAP is certainly more powerful than XML-RPC.
A SOAP method is simply an HTTP request or response that complies with the
encoding rules of SOAP. This can be expressed as shown below in
figure 3.20.
A SOAP request can be an HTTP POST or an HTTP GET request. An HTTP POST request
must have at least two headers. They are as listed below:
Content-Type
Content-Length
Content-Type: Content-Type specifies the MIME type of the message.
It may have an optional item called charset, which indicates the character set used in
the body of the request or response.
Content-Length: Content-Length indicates the number of bytes in the body of
the request or response. An example is shown below:
Content-Length: 300
In the above diagram, notice that the HTTP header holds
information such as Host, Content-Type and Content-Length.
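The relation between these headers and the message body can be made concrete with a short sketch. The Python fragment below assembles the raw text of an HTTP POST carrying a SOAP envelope and computes Content-Length from the encoded body. The function name build_post, the path /scores and the host name are hypothetical, and nothing is sent over the network.

```python
ENVELOPE = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body>
    <m:GetRuns xmlns:m="http://www.onedaycricket.com/scores">
      <m:Team>India</m:Team>
    </m:GetRuns>
  </soap:Body>
</soap:Envelope>"""


def build_post(path, host, envelope):
    """Return the raw bytes of an HTTP POST carrying the given SOAP envelope."""
    body = envelope.encode("utf-8")
    head = "\r\n".join([
        "POST %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Content-Type: application/soap+xml; charset=utf-8",
        # Content-Length counts the bytes of the encoded body, not characters.
        "Content-Length: %d" % len(body),
    ])
    # A blank line separates the headers from the body.
    return head.encode("ascii") + b"\r\n\r\n" + body


request = build_post("/scores", "www.onedaycricket.com", ENVELOPE)
head, _, payload = request.partition(b"\r\n\r\n")
print(head.decode("ascii"))
```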
A possible response for the above SOAP request is as shown in Figure 3.22
HTTP/1.1 200 OK
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 327
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Body>
<m:GetRunsResponse
xmlns:m="http://www.onedaycricket.com/scores">
If you look at the above response message, you can notice the presence of “200
OK”, which indicates a successful HTTP response.
The HTTP header carries additional information such as Content-Type, charset and
Content-Length.
SOAP Intermediary
SOAP Sender: The node from which the message starts its journey.
SOAP Receiver: The node for which the message is intended. The SOAP
receiver processes the message it receives. It can respond with the output
of the processing or with a SOAP Fault.
SOAP Intermediary: SOAP intermediary nodes can both receive and
send SOAP messages. SOAP intermediaries are optional: a message from
the sender may reach the receiver without going through any
intermediary.
SOAP intermediaries can be classified into two types. They are as shown in
figure 3.24.
As stated earlier, the root element of a SOAP message is the Envelope. The contents
of the SOAP envelope must strictly follow the rules of XML. There is a part outside
the SOAP envelope called “SOAP attachments”.
SOAP attachments can contain data in ASCII or binary format.
SOAP attachments are not part of the SOAP envelope.
Though the attachments are outside the envelope, they are related to the message
sent.
Each attachment of the message can be identified with an ID called Content-ID.
An attachment can be identified by its content location as well.
Attachments allow any kind of data to be associated with a SOAP message,
which is very helpful when you would like to send an image or some
other file with the SOAP message.
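SOAP messages with attachments are commonly packaged as a multipart/related MIME message, in which the envelope is one part and each attachment is another part with its own Content-ID. The sketch below builds such a package with Python's standard email package and reads it back. The Content-ID values and the sixteen bytes standing in for an image are assumptions made for illustration.

```python
import email
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

ENVELOPE = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body/>
</soap:Envelope>"""
ATTACHMENT = bytes(range(16))   # stand-in for an image or other binary file

# The envelope and the attachment travel as separate parts of one package.
package = MIMEMultipart("related", type="text/xml")

envelope_part = MIMEText(ENVELOPE, "xml", "utf-8")
envelope_part.add_header("Content-ID", "<envelope@example.org>")
package.attach(envelope_part)

binary_part = MIMEApplication(ATTACHMENT)
binary_part.add_header("Content-ID", "<scorecard@example.org>")
package.attach(binary_part)

# Receiver side: parse the package and find each part by its Content-ID.
received = email.message_from_bytes(package.as_bytes())
content_ids = [part["Content-ID"] for part in received.get_payload()]
recovered = received.get_payload()[1].get_payload(decode=True)

print(received.get_content_type())
print(content_ids)
```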
Questions
Part A
Multiple Choice Questions
1. SOAP Stands for
a. Service Oriented Architecture Protocol
b. Simple Object Access Protocol
c. Serial Object Access Protocol
d. All of the above
e. None of the above
2. Find the odd item
a. XML RPC
b. JAVA RMI
c. SOAP
d. AJAX
3. HTTP
a. is independent of the platform
b. text based
c. can pass through firewalls
d. all of the above
e. none of the above
4. CORBA is
a. Complex
b. Platform dependent
c. binary
d. all of the above
e. none of the above
5. SOAP is
a. Platform dependent
b. complex
c. binary
d. all of the above
e. none of the above
6. Which of the following played a key role in SOAP origin?
a. Developmentor
b. DevelopSoft
c. DSoft
d. All of the above
e. None of the above
7. Which of the following is not a component of SOAP message?
a. Envelope
b. Header
c. Body
d. All of the above
e. None of the above
8. Which of the following is not a valid “mustUnderstand” value?
a. 1
b. 0
c. -1
d. all of the above
e. none of the above
9. Find the odd item
a. VersionMismatch
b. MustUnderstand
c. Client
d. Version
10. Which of the following is not there in http header
a. Content-Type
b. Content-Length
c. Content-Code
d. All of the above
e. None of the above
Answers
1. b 2. d 3. d 4. d 5. e
6. a 7. e 8. c 9. d 10. c
Section B
Short Answer Question
11. List out various communication protocols.
12. Explain the structure of XML-RPC response
13. List out the various components of a SOAP message.
14. Write short notes on SOAP Header
15. Write short notes on SOAP Attachments.
16. List out components of SOAP Fault.
17. Write short notes on SOAP intermediaries
Section C
18. Explain the working mechanism of XML RPC
19. Explain the SOAP components in detail.
20. Compare XML RPC with SOAP.
UNIT IV
WEB SERVICES
4.1 INTRODUCTION
As we learnt in UNIT I, a web service is code that solves a particular problem but
does not reside on the machine where the program that uses it is executing. In other
words, it can be perceived as a program component that resides somewhere on the
internet and can be accessed from a remote place through standard internet
technologies like HTTP. The objective here is to
establish communication between various technologies irrespective of platform or
product. Is it possible? Yes, it is possible through web services. The characteristics of
web services allow businesses to use the internet to publish, discover and aggregate
other web services using the SOAP protocol.
4.2 WHAT IS A WEB SERVICE?
Let us see the formal definition of a web service again:
A Web service is a software system designed to support interoperable machine-
to-machine interaction over a network. It has an interface described in a machine-
processable format (specifically WSDL). Other systems interact with the Web service
in a manner prescribed by its description using SOAP messages, typically conveyed
using HTTP with an XML serialization in conjunction with other Web-related
standards
Is it confusing? Don’t worry. Let us make it clear: over the years, three primary
technologies have emerged as worldwide standards that make up the core of today’s web
services technology. These technologies are SOAP, WSDL and UDDI. The
definitions of each of them follow.
4.2.1 Simple Object Access Protocol (SOAP)
SMTP, HTTP and FTP are some of the standard internet technologies for transporting
content or documents. SOAP acts as a standard cover over an XML document, which
wraps it and makes it ready for transport. Is that the only job done by SOAP? No, it
also defines encoding and binding standards. What is the use of them? These standards
are used for encoding non-XML RPC invocations in XML for transport. The structure SOAP
provides is simple for doing RPC: document exchange. The result is that heterogeneous
clients and servers can easily become interoperable by having a standard transport
mechanism. For example,
.NET clients can invoke EJBs exposed through SOAP
Java clients can invoke .NET components exposed through SOAP
Yes, it is through WSDL. WSDL is an XML technology that describes the
interface of a web service in a standardized way. WSDL allows clients to automatically
understand how to interact with a web service.
dynamic, standard infrastructure for enabling the dynamic business of tomorrow. Combined,
these technologies are revolutionary because they are the first standard technologies to
offer the promise of a dynamic business.
Now the question may arise whether equivalent features were available in the past.
Yes, they were available, but they were not supported by every major corporation and
did not have a core language as flexible as XML.
Now look at figure 4.1, which makes clear the connection between the three
technologies and their interactions. Fig 4.1 depicts
communication across the web using web services, XML and SOAP, with UDDI as the
repository. In other words, the diagram demonstrates the relationship between these three
technologies. It shows that web services built on SOAP can be exposed to
interested parties over the internet from any web-connected device. This paradigm is based
on the approach of “assembly of constituent parts”. SOAP is not a stand-alone technology,
but the result of synergies between XML and HTTP.
How exactly do these technologies work together to solve a problem using the services?
The following is the sequence of steps a client application follows to locate another
application or a piece of business logic somewhere on the network:
The client queries a UDDI registry for the service by name, category, identifier,
or specification supported.
Once the service is located, the client obtains the location of a WSDL document
from the UDDI registry.
The WSDL document contains information about how to contact the web service
and the format of request messages, expressed in XML Schema.
The client creates a SOAP message in accordance with the XML schema found in
the WSDL and sends a request to the host (where the service is).
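The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registry and the WSDL document are replaced by in-memory dictionaries, no network call is made, and the service name, the example.com URLs and the field names are all hypothetical.

```python
# Sketch of the discovery steps with in-memory stand-ins (no network access).
# All names, URLs and data below are hypothetical.

# Steps 1-2: a toy "UDDI registry" mapping a service name to its WSDL location.
REGISTRY = {
    "CurrencyConverter": "http://example.com/currency?wsdl",
}

# Step 3: a toy "WSDL document" per location, reduced to the two facts the client
# needs: where to send requests and which fields a request message must carry.
WSDL_DOCS = {
    "http://example.com/currency?wsdl": {
        "endpoint": "http://example.com/currency",
        "request_fields": ["amount", "from", "to"],
    },
}


def find_wsdl_location(service_name):
    """Steps 1-2: query the registry by name and return the WSDL location."""
    return REGISTRY[service_name]


def read_wsdl(location):
    """Step 3: fetch the service description (here, a dictionary lookup)."""
    return WSDL_DOCS[location]


def build_soap_request(description, values):
    """Step 4: build a SOAP-style message carrying exactly the described fields."""
    missing = [f for f in description["request_fields"] if f not in values]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    parts = "".join(
        "<{0}>{1}</{0}>".format(f, values[f]) for f in description["request_fields"]
    )
    return "<Envelope><Body><Request>" + parts + "</Request></Body></Envelope>"


location = find_wsdl_location("CurrencyConverter")
description = read_wsdl(location)
message = build_soap_request(description, {"amount": 100, "from": "USD", "to": "EUR"})
print(description["endpoint"])
print(message)
```

In a real client the last step would post the message to the endpoint over HTTP; that part is left out so the sketch stays self-contained.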
In other words, the same process can be explained in terms of four functions: Describing,
Exposing, Being invoked, and Returning a Response.
Describing (WSDL)
A web service describes its functionality and attributes so that other applications can
locate it and use it.
Exposing (UDDI)
A web service has to register with a repository, which may contain White pages,
Yellow pages and Green pages. The White pages hold the basic service provider
information, the Yellow pages list the services by category, and the Green pages
contain the information about how to connect to and use the services.
Being invoked
Whenever a web service has been located, a remote application can invoke the service.
Returning & Response
When a service has been invoked, results are returned to the requesting application.
What is achieved with these functions as a whole? Industry needs a flexible and
efficient environment for business collaboration, and these concepts come in handy
as a solution. Technically, it is a way to link loosely coupled systems using
technology that doesn’t bind them to a particular programming language. This supports the
idea of component assembly, which also promises improved collaboration with customers,
partners and suppliers.
4.2.4 What qualifies as Web Services?
over the web rather than object-to-object connections over limited networks. Web services
also promise improved collaboration with customers, partners and suppliers.
We know that significant additional services rarely come without a cost. What is the
cost here, and how will web services perform on a large scale? Delivering simple
services is straightforward. But until the technology matures, an up-front human
element is still required to solidify agreements. Setting that aside, let us concentrate on
the details of the applications and web service protocols.
4.2.5 Practical Applications for Web Services
Imagine a person requires a currency conversion service that converts Dollars to
Euros or Rupees to Dollars. Another person requires a natural language translation
service that converts English to French. With the technology now available, such
components can be provided through the cross-platform interoperability promised
by SOAP and web services. Today, some web sites, such as www.xmethods.com,
host simple web services.
When we see real companies using web services to automate and streamline their
business processes, this scenario becomes more exciting.
Let’s use the concept of a Business-to-Consumer (B2C) portal. Take a closer
look at web-based portals, such as those used by the travel industry. They
often combine the products and services of multiple companies and
present them with a unified look and feel to the consumer accessing the portal.
We have to keep in mind how difficult it is to integrate the backend systems of
each business so that the advertised portal services are provided reliably
and quickly.
For example, assume there are two companies, namely MAP CAR SYSTEMS and
PAM AIRLINES COMPANY, and that web services technology is already being used in
the integration between them. MAP CAR SYSTEMS uses the Microsoft SOAP Toolkit
to integrate its online booking system with PAM AIRLINES COMPANY’s site. The
MAP CAR SYSTEMS booking system runs on a Sun Solaris server, and PAM AIRLINES
COMPANY’s site runs on a Compaq OpenVMS server. The net result is that a person
booking a flight on the PAM AIRLINES web site can reserve a car from MAP CAR
SYSTEMS without leaving the airline’s site. The resulting saving for MAP CAR
SYSTEMS is a lower cost per transaction. If the booking is done
online through PAM AIRLINES and other airline sites, the cost per transaction is about
$1.00. When booking through traditional travel agent networks, this cost can be up to
$5.00 per transaction.
Let us look at another application area, the healthcare industry, in
which web services can be put to use effectively. A doctor carrying a handheld
device can access your records, health history, and preferred pharmacy using
a web service. The doctor can also write you an electronic prescription and send
it directly to your preferred pharmacy via another web service. Imagine a situation
where all pharmacies in the world use a standardized communication protocol for
accepting prescriptions: the doctor could write you a prescription for any pharmacy
that you selected. The pharmacy would be able to fulfill the prescription immediately
and have it prepared for you when you arrive or couriered to your residence.
This model can be extended further in the same application domain. If the interfaces
used between doctors and pharmacies are standardized using web services, a
portal broker could act as an intermediary between doctors and pharmacies
providing routing information for requests. In addition, it can better meet the needs
of individual consumers. For example, consider a situation where a patient
registers with an intermediary and specifies that he wants to use generic drugs instead
of expensive brand names. In this situation, the intermediary can intercept the
pharmaceutical web service request and transform it into a similar one
for the generic drug equivalent. In this process the intermediary exposes web
services to doctors and pharmacies (in both directions) and handles issues such as
security, privacy, and non-repudiation.
In all these applications, the minimum requirement is that each participant in the
multiparty collaboration should know how to construct and deconstruct SOAP messages
and how to send and receive HTTP transmissions. It should now be clear in what
situations these technologies can be used. You can also imagine similar applications
where these technologies can be used to achieve the task.
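To make “construct and deconstruct SOAP messages” concrete, here is a minimal sketch using Python’s standard xml.etree.ElementTree module. It assumes a SOAP 1.1 envelope; the service namespace (http://example.com/currency) and the Convert, Amount, From and To element names are hypothetical and chosen only to match the currency conversion example above.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/currency"                  # hypothetical service namespace


def construct_request(amount, source, target):
    """Build a SOAP envelope whose body carries one Convert call."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    call = ET.SubElement(body, "{%s}Convert" % APP_NS)
    ET.SubElement(call, "{%s}Amount" % APP_NS).text = str(amount)
    ET.SubElement(call, "{%s}From" % APP_NS).text = source
    ET.SubElement(call, "{%s}To" % APP_NS).text = target
    return ET.tostring(envelope, encoding="unicode")


def deconstruct_request(xml_text):
    """Parse the envelope and return the call parameters as a dictionary."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find("{%s}Body/{%s}Convert" % (SOAP_NS, APP_NS))
    # Each child tag looks like "{namespace}Name"; keep only the local name.
    return {child.tag.split("}")[1]: child.text for child in call}


request_xml = construct_request(100, "USD", "EUR")
params = deconstruct_request(request_xml)
print(params)
```

Sending the message is then a matter of an HTTP POST with the XML as the request body, which any HTTP library can do.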
4.2.6 Web Service Architecture
What are the major aspects of Web Service Architecture? There are three:
service provider, service requestor and broker. Let us explore each one in detail:
A Service Provider provides the software pieces that can carry out a specified
set of tasks.
A Service Requestor discovers and invokes a software service to provide a business
solution. Generally the requestor invokes a remote procedure call on the service
provider, passing the parameter data, and receives a result in reply.
A Broker manages and publishes the services provided by the various service
providers.
How can we manage all these three aspects? Are there any underlying key technologies
available to readily handle this? Let us recall the definitions we gave for UDDI, WSDL and
SOAP:
UDDI is a protocol for describing Web Services components that allows providers
to register with an Internet Directory to advertise their services.
WSDL is the proposed standard for describing Web Services. WSDL provides features
for defining service interfaces and their implementation. WSDL syntax is based on XML.
SOAP is an XML-based messaging protocol used to communicate with web services,
including a UDDI service. The advantage of SOAP is that it can use universal HTTP to
make a request and to receive a response.
As the explanation indicates, these technologies can be used to realize the three aspects
(service provider, service requestor and broker), as depicted in Fig 4.2. Let us look
at each of these technologies in detail.
[Fig 4.2: Web service architecture. Labels recoverable from the figure: Web client, Web
services provider, Web services repository (UDDI; White, Yellow and Green Pages), WSDL,
XML/SOAP, SOAP, HTTP, XML]
4.3 UDDI
UDDI needs a facility to describe services uniformly, so that the descriptions can be
stored in a directory and used by any service. UDDI originates from a cooperative
agreement among IBM, Microsoft and Ariba on an XML-based specification for establishing
a registry of businesses and services on the Internet. In a nutshell, we can say that UDDI
defines an XML-based infrastructure for software to automatically discover available services
on the web, using SOAP as the protocol to invoke services.
4.3.1 UDDI Registries
UDDI registries are the focal points for registering and locating services. The services
registered may serve the internal requirements of an organization or may be open
to all others. Hence a registry may be a public registry or a private registry.
Microsoft, IBM and HP have agreed to provide a public UDDI registry which can be used
for search and connection across the entire internet. Many private registries are also
available, which can be used either internally within companies or among a closely knit
family of trusted partners and collaborators.
Hence a UDDI-compliant registry should provide an information framework for
the description of both public and private web service registries. Because of this openness,
many IT companies have started using web service technologies behind their firewalls for
application-to-application integration. This encourages managers and developers to
gain experience by starting with less critical projects and then migrating to more ambitious ones.
Let us now see the specifications used to describe a service in the registry.
What is to be specified?
These specifications outline the details of the XML structures associated with the
functions.
The UDDI data structure specifications are
businessEntity
businessService
bindingTemplate
tModel
Together, these four data structures specify how a service is described when it is
registered in the UDDI registry.
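The containment between the four structures can be pictured with a small Python sketch. It assumes the usual UDDI nesting (a businessEntity holds businessServices, each businessService holds bindingTemplates, and a bindingTemplate refers to tModels by key). The field names are simplified for illustration and are not the full UDDI schema; the keys and URLs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TModel:
    """Describes an interface or specification that a service follows."""
    key: str
    name: str
    overview_url: str


@dataclass
class BindingTemplate:
    """Technical details needed to call the service, plus the tModels it follows."""
    binding_key: str
    access_point: str
    tmodel_keys: List[str] = field(default_factory=list)


@dataclass
class BusinessService:
    """One service offered by a business."""
    service_key: str
    name: str
    binding_templates: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    """The business itself, with the services it has registered."""
    business_key: str
    name: str
    services: List[BusinessService] = field(default_factory=list)


# Hypothetical registration for a book price service.
price_interface = TModel("tm-1", "BookPriceQuery", "http://books.example.com/price.wsdl")
binding = BindingTemplate("bind-1", "http://books.example.com/soap", [price_interface.key])
service = BusinessService("svc-1", "Book price lookup", [binding])
entity = BusinessEntity("biz-1", "Example Books Ltd", [service])

# Drill down from the business to the address where the service can be called.
access_point = entity.services[0].binding_templates[0].access_point
print(access_point)
```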
Example Scenario:
The following is a scenario of interaction for connecting to a server using UDDI discovery.
The domain assumed to explain the concepts is a books service.
A company requires software which connects to several book service providers.
According to the requirement, the software has to compare prices, delivery
times, additional charges depending on the place of delivery, etc.
This requires a connection to be established to the UDDI business registry.
It may be through the web interface or the Inquiry API.
After establishing the connection, a lookup based on an appropriate yellow
pages listing is performed, and the company obtains the businessEntity.
Using this businessEntity, the information can be used as it is or drilled down
for more detail, depending on the requirement. The objective here is to obtain
a bindingTemplate to connect to the particular server which provides the service.
Based on the details of the specification provided by the bindingTemplate, the
company sets up its program to interact with the particular web service. The
semantics of the service are obtained by accessing the tModel referenced in the
bindingTemplate for the service.
At runtime, the program invokes the web service based on the connection details
provided in the bindingTemplate.
If the required interface connections as specified in the tModel exist, calls to the
remote service will be successful. On the other hand, if there is a problem with the interaction
between client and web service, UDDI specifies a convention for failure and recovery.
It is important for clients to detect and recover from failures that occur during interaction
with remote partners. The client caches the bindingTemplate, which describes the calling
convention. This cached information is refreshed with the current information from a UDDI
web registry when a failure occurs. The following sequence of steps indicates how error
recovery fits into web services:
The program developed to use a web service caches the appropriate
bindingTemplate for use at runtime.
At the time of executing the program, the cached bindingTemplate that was earlier
obtained from the UDDI web registry is used by the program.
In case of any failure, a new copy of the bindingTemplate for this unique web
service is obtained through the bindingKey value and the get_bindingTemplate
API call.
The program compares the new bindingTemplate information with the cached
version. If they are different, the program retries the failed call using the new
bindingTemplate.
In this “retry on failure” approach, the program (client) retries the call only
after a failure, using the refreshed data. This is more efficient than acquiring
a new copy of the bindingTemplate data before every call. The approach also proves
useful when a business needs to redirect traffic to a new site: it only needs to
activate the new site and change the published location information for the
affected bindingTemplates.
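The recovery steps can be sketched as follows. This is a simplified illustration, not a UDDI client: the registry is a dictionary, the bindingTemplate is reduced to a single access point string, fetch_binding stands in for the get_bindingTemplate call, and all names and URLs are hypothetical.

```python
class ServiceUnavailable(Exception):
    """Raised by the (simulated) service call when an access point does not answer."""


def invoke_with_recovery(call_service, fetch_binding, cache, binding_key):
    """Call the service with the cached binding; refresh and retry once on failure."""
    cached = cache[binding_key]
    try:
        return call_service(cached)
    except ServiceUnavailable:
        # Stand-in for get_bindingTemplate: ask the registry for a fresh copy.
        fresh = fetch_binding(binding_key)
        if fresh == cached:
            # The registry has nothing newer, so the failure is passed on.
            raise
        result = call_service(fresh)
        # The retry worked, so the fresh copy replaces the cached one.
        cache[binding_key] = fresh
        return result


# Simulated setup: the business has moved its service to a new site.
OLD = "http://old.example.com/books"
NEW = "http://new.example.com/books"
registry = {"binding-1": NEW}
cache = {"binding-1": OLD}


def call_service(access_point):
    if access_point != NEW:
        raise ServiceUnavailable(access_point)
    return "ok from " + access_point


result = invoke_with_recovery(call_service, registry.get, cache, "binding-1")
print(result)
```

Note that the registry is contacted only after a failed call, which is the point of the approach: normal calls pay no lookup cost.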
4.3.2 UDDI tModel
What exactly is tModel? Why do we need it? How do we use it? It turns out the
tModel concept is just like the XML namespace concept: it is not complex at all, yet it can
be very confusing.
tModel Is Used to Represent Interfaces:
UDDI is an online “yellow book” that is used by both the service providers and
service consumers. The service providers will register their Web services into UDDI, and
service consumers will try to find the service descriptions from this online registry which
will finally lead to the services that they desire. The idea of “interface” in the world of
UDDI is more or less similar to the concept of interface in the world of COM/DCOM,
i.e., it is the “contract” that both the service provider and the service consumer will honor:
the service provider promises to implement the Web service in such a way that if the
consumer invokes the service by following this contract, his/her application will get what it
expects.
Notice that the interface a Web service implements may or may not be defined by this
service provider. For instance, some major airlines may get together and form a committee
which will work out and publish (register) an interface in UDDI for querying the ticket
price on a given date, time, and city pairs. This published interface will become the industrial
standard, and the implementation work is left to be done by each specific airline. Each
airline will then develop a Web service that implements this interface and also register the
service with UDDI. In this case, the interface is not defined by the airline which implements
it. Also, it is quite obvious that the life of a travel agent is now quite easy: although we have
quite a few different airlines, there is only one querying interface he/she needs to worry
about.
Now suppose the Web service a given provider wants to register has no
standard interface at all. In this case, the provider will first have to create and register an
interface with UDDI. After this interface is registered, the service that implements it can
then be developed and registered.
In what kind of “language” is the interface described? The answer gives the first big
role of tModel: every single interface in UDDI is represented by a tModel.
An example seems to be appropriate at this point. Let us say that we want to create
a Web service for CodeProject.com which will accept a String representing a person’s
name, and will return a non-negative integer indicating how many articles this person has
submitted to CodeProject.com. This seems to be a fairly “special” service, so we assume
there is no current “standard” for this service, i.e., there is no existing interface we can
register our service against. Therefore, we need to create our own interface first.
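The idea of an interface as a contract can be illustrated outside UDDI with a short Python sketch. It is only an analogy: in UDDI the interface would be registered as a tModel, whereas here it is a typing.Protocol (Python 3.8 or later). The class names, the method name and the sample data are hypothetical.

```python
from typing import Protocol


class ArticleCounter(Protocol):
    """The contract: given a person's name, return a non-negative article count."""

    def article_count(self, author_name: str) -> int:
        ...


class InMemoryArticleCounter:
    """One possible provider that honors the contract, backed by a dictionary."""

    def __init__(self, counts):
        self._counts = dict(counts)

    def article_count(self, author_name: str) -> int:
        # Unknown authors have submitted zero articles.
        return self._counts.get(author_name, 0)


def report(counter: ArticleCounter, name: str) -> str:
    """A consumer that relies only on the contract, not on a specific provider."""
    return "%s has submitted %d article(s)" % (name, counter.article_count(name))


provider = InMemoryArticleCounter({"Asha": 3})
line = report(provider, "Asha")
print(line)
```

Any other provider with the same method could be passed to report without changing the consumer, which is what a shared interface promises.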
So, how can we think about UDDI? As a whole, it can be described as a project.
The Universal Description, Discovery, and Integration (UDDI) Project provides a
standardized method for publishing and discovering information about web services. It is
an industry initiative that attempts to create a platform-independent, open framework for
describing services, discovering businesses, and integrating business services. With this, let
us explore what WSDL is.
4.4 WSDL
WSDL is an XML format for describing how one software system can connect to and
utilize the services of another software system over the internet. WSDL is designed
to be extensible. This extensibility allows WSDL to be
used to:
Describe endpoints and their messages, regardless of the message format or network
protocol used to exchange them.
Treat messages as abstract descriptions of the data being exchanged.
Treat port types as abstract collections of web services’ operations. A port type
can then be mapped to a concrete protocol and data format.
If you are not comfortable with these items, don’t worry. We will see less
“scientific” definitions as we go along; don’t let the terms scare you away from this
technology. Are you ready? Let’s start.
4.4.1 What Is WSDL?
What is needed is a standard way of describing how two machines interact
with each other. Why is this important? Because the number of communication formats
and protocols used on the Internet continues to increase.
WSDL provides a simple answer. WSDL describes:
What a service does
How to invoke its operations
Where to find it
Hence WSDL has created separate definitions and terminology for defining a web
service, the communication endpoint where that web service exists, the legal format for
input and output messages for the web service, and an abstract way to declare a binding to
a concrete protocol and data format. But everything defined within a WSDL file is abstract:
it’s just the definition of parameters and constraints for how communication should occur
at runtime. Then whose responsibility is it to provide the exact service implementation
specifications? Keep this question in mind.
The web service implementation has to adhere to the guidelines defined in the WSDL
file but has some flexibility over specifics. WSDL also provides the ability to define a
binding that attaches an abstract set of message definitions to a concrete protocol or data
format. A binding extension is a type of binding defined for a major protocol. WSDL
defines out-of-the-box binding extensions for SOAP 1.1, HTTP GET, HTTP POST, and
MIME.
WSDL defines services as collections of network endpoints or ports. Here the abstract
definitions are separated from their concrete network-based data bindings. The following
figure shows the information that a WSDL file should contain to describe a web service.
Now let us see the individual parts of a WSDL document. To keep the outline
simple, the elements from WSDL binding extensions (i.e., SOAP, HTTP, etc.) are not
included. The following outline shows the major elements that may appear in a
WSDL document. The asterisk (*) specifies that more than one of the indicated elements
may appear.
<definitions>
    <import>*
    <types>
        <schema></schema>*
    </types>
    <message>*
        <part></part>*
    </message>
    <portType>*
        <operation>*
            <input></input>
            <output></output>
            <fault></fault>*
        </operation>
    </portType>
    <binding>*
        <operation>*
            <input></input>
            <output></output>
        </operation>
    </binding>
    <service>*
        <port></port>*
    </service>
</definitions>
The description of the service is enclosed between the <definitions> and </definitions>
tags in a WSDL document. The global namespace declarations, which are intended to be
visible throughout the rest of the document, are made here.
Do you recall XML namespaces? As a refresher, a namespace is
a name that qualifies element and attribute names.
A namespace declaration provides an alias (prefix) to use within the current XML document
for referring to the rules defined in a separate XML Schema document. In other words, it
is used as a qualifier for tags/elements. For example, if two XML Schema documents each
define the <name> tag with different subelements, how would an XML file that uses both
schemas know which <name> definition to refer to? The namespace alias is used as a
prefix to qualify an XML tag as coming from a particular XML Schema document.
Let us move on to the other elements in the service description.
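The <name> example can be tried out directly. The sketch below uses Python’s standard xml.etree.ElementTree module; the two namespace URIs, the cust and prod prefixes and the document content are hypothetical. It shows that the two <name> elements stay distinct once each is qualified by its namespace.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define <name>, with different content models.
DOC = """
<order xmlns:cust="http://example.com/customer"
       xmlns:prod="http://example.com/product">
  <cust:name><cust:first>Asha</cust:first><cust:last>Rao</cust:last></cust:name>
  <prod:name>XML Primer</prod:name>
</order>
"""

root = ET.fromstring(DOC.strip())

# Map the prefixes used in the queries to their namespace URIs.
ns = {
    "cust": "http://example.com/customer",
    "prod": "http://example.com/product",
}

customer_name = root.find("cust:name", ns)
product_name = root.find("prod:name", ns)

# The parser stores each tag as "{namespace}localname", so the two never clash.
print(customer_name.tag)
print(product_name.tag)
print(customer_name.find("cust:first", ns).text, product_name.text)
```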
<import> Element
The <import> element allows a WSDL document to reference definitions held in
other documents. This supports the modularization
of WSDL documents and creates an environment of reuse that can create clear service
definitions. What is the advantage of having this element? It allows us to keep WSDL
documents well structured, so they are easier to use and maintain. But it requires any WSDL
parsing engine to perform additional I/O operations to import any externally referenced
resource.
For example, the following statement gives the namespace to import and the
location where its definitions can be found:
<import namespace="http://abcd.efgh.net/xer" location="http://abcd.efgh.net/xer/ij.xsd"/>
Thus the <import> element imports the namespace of another file, not the file
itself. When an <import> statement is used, all elements for that given namespace are
included at the location of the <import> element in the parent document.
<types> Element
The <types> element in a WSDL document acts as a container for defining the
data types used in <message> elements. The <message> element, in turn, defines the
format of messages exchanged between a client and a web service. Currently, XML
Schema Definition (XSD) is the most widely used data typing method, although WSDL
accepts other typing approaches as well. Generally, a study of the <types> element
is a study of XML Schema.
The <types> element can have zero or more <schema> subelements, which must
follow the rules for XML Schema documents. The schema definitions need not be
written directly inside the <types> element. Instead, the schema definitions required
for the WSDL document may be included via the <import> element. The WSDL parser then
treats the imported elements as part of the <types> definition.
<message> Element
The data to be communicated or exchanged as part of a web service has to be
modeled. This is done with the <message> element, and the data it describes
is abstract. A message may be a single unit or may be divided into sub-messages.
If it is divided, the <part> tag comes into the
picture. A <part> subelement identifies an individual piece of data that is part of the
message and the data type that the piece adheres to. Look at the following piece of
code containing a <message> element and <part> elements:
<message name="s_Header">
    <part type="xsd:string" name="id"/>
    <part type="xsd:string" name="timeout"/>
</message>
In the previous code, the <message> element is uniquely identified by the name attribute.
This message is named s_Header; it has two <part> subelements, of which the first is
named id and the second is named timeout. Here, each part is typed as an XML Schema
string (xsd:string). But the types used in part definitions aren’t required to come from XML
Schema; they could just as well be defined in the <types> element of the existing WSDL
document itself.
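As a quick check of this reading, the s_Header message can be parsed with Python’s standard xml.etree.ElementTree module to list each part with its type. This is a small sketch for inspection only; a real WSDL document would also declare the wsdl and xsd namespaces, which are omitted here as in the snippet above.

```python
import xml.etree.ElementTree as ET

# The <message> snippet from the text, with plain ASCII quotes.
MESSAGE = """<message name="s_Header">
    <part type="xsd:string" name="id"/>
    <part type="xsd:string" name="timeout"/>
</message>"""

message = ET.fromstring(MESSAGE)

# Collect (name, type) for every <part>, in document order.
parts = [(part.get("name"), part.get("type")) for part in message.findall("part")]

print(message.get("name"))
for name, type_name in parts:
    print(name, type_name)
```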
<portType> Element
The <portType> element defines a set of operations supported by an endpoint of a
web service. In other words, it is a uniquely named group of actions that can be
executed at a single endpoint.
How are the operations defined? Each one is defined with an
<operation> element, which is an
abstract definition of an action supported by a web service. A WSDL operation can have
input and output messages as part of its action. The <operation> tag defines the name of
the action by using a name attribute, defines the input message by the <input> subelement,
and defines the output message by the <output> subelement. The <input> and <output>
elements reference <message> elements defined in the same WSDL document or an
imported one. A <message> element can represent a request, a response, or a fault.
<portType name="abce1111PortTypes">
This element declares that this endpoint has a set of operations that are jointly referenced
as abce1111PortTypes. The following lines define the <operation> elements for this
<portType>:
<!-- Request-response Operations (client initiated) -->
<operation name="init">
    <input message="initRequest"/>
    <output message="initResponse"/>
</operation>
<operation name="search">
    <input message="searchRequest"/>
    <output message="searchResponse"/>
</operation>
Operations are grouped according to their behavior. When an operation is
defined in a WSDL document, it is abstract; it is purely an operation definition,
and how that operation is mapped to a real function is defined later (i.e., the operation can
behave in a number of different ways depending on the actual definition). The WSDL
specification defines the following behavioral patterns as transmission primitives:
Request-response
Solicit-response
One-way
Notification
First, the operation can follow a request-response model, in which a web service
client invokes a request and expects to receive a synchronous response message. This
model is defined by the presence of both <input> and <output> elements. The <input>
element must appear before the <output> element. This order indicates that the operation
first accepts an input message (request) and then sends an output message (response).
This model is similar to a normal procedure call, in which the calling method blocks until
the called method returns its result.
Second, the operation can follow a solicit-response model, in which the web service
sends a message to the client and expects to receive a response. This model is defined
as having both <input> and <output> elements. The <output> element must appear before
the <input> element. This order indicates that the operation first sends an output message
(solicit) and then receives an input message (response).
Third, the operation can be a one-way invocation, in which the web service client
sends a message to the web service without expecting to receive a response. This model is
defined by a single <input> message with no <output> message. This model indicates that
the operation receives input messages (one-way invocation), but doesn’t deliver a response
to the client.
Fourth, the operation can be a notification, in which the web service sends a one-way
message to the client without expecting a response. This model is defined by a single
<output> message and no <input> message. It indicates that the operation sends
output messages asynchronously; i.e., the messages are not in response to a request, but
can be sent at any time. The operation doesn’t expect a response to the messages it sends.
The value assigned to the name attribute of each <operation> element must be unique
within the scope of the <portType>. The names of the input and output messages must be
unique within the <portType>, not just the <operation>. The value assigned to the message
attribute of an <input> or <output> element must match one of the names of the <message>
elements defined in the same WSDL document or in an imported one.
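Since the four transmission primitives are distinguished only by which of <input> and <output> are present and in what order, the rule is easy to express in code. The following sketch uses Python’s standard xml.etree.ElementTree module; the port type, operation and message names are hypothetical, and namespaces are omitted as in the snippets above.

```python
import xml.etree.ElementTree as ET


def transmission_primitive(operation):
    """Classify an <operation> by the order of its <input>/<output> children."""
    order = tuple(
        child.tag for child in operation if child.tag in ("input", "output")
    )
    patterns = {
        ("input", "output"): "request-response",
        ("output", "input"): "solicit-response",
        ("input",): "one-way",
        ("output",): "notification",
    }
    return patterns.get(order, "unknown")


# A hypothetical port type with one operation of each kind.
PORT_TYPE = """
<portType name="ExamplePortType">
  <operation name="search">
    <input message="searchRequest"/>
    <output message="searchResponse"/>
  </operation>
  <operation name="poll">
    <output message="pollRequest"/>
    <input message="pollResponse"/>
  </operation>
  <operation name="log">
    <input message="logEntry"/>
  </operation>
  <operation name="alert">
    <output message="alertEvent"/>
  </operation>
</portType>
"""

port_type = ET.fromstring(PORT_TYPE.strip())
kinds = {
    op.get("name"): transmission_primitive(op)
    for op in port_type.findall("operation")
}
print(kinds)
```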
<binding> Element
A <binding> element is a concrete protocol and data format specification for a
<portType> element. It is where you would use one of the standard binding extensions—
HTTP, SOAP, or MIME—or create one of your own. Each protocol has its own wire
format. For example, HTTP has a simple header/body format. SOAP, which can exist
inside of HTTP and other protocols, has its own header and body. A SOAP message can
have attachments included as part of a message.
The WSDL document has already defined the <operation> elements for this web
service. A <binding> element takes the abstract definition of the operations and their input/
output messages and maps them to the concrete protocol that the web service uses. Should
the <input> element defined in a WSDL document be located in the SOAP header? Should
it be in the SOAP body? Should it be in an attachment? Also, how should the data
be encoded? Should the supplied schema be used for encoding rules or should literal
encoding be used? The <binding> element answers these questions through this mapping.
This portion illustrates a web service example: a service, created using ASP.NET, that
converts temperature from Fahrenheit to Celsius and from Celsius to Fahrenheit
(www.w3schools.com/webservices).
The general assumption is that any application can have a web service component.
Web services can be created regardless of programming language. This example is saved
as an .asmx file, which is the ASP.NET file extension for XML Web Services.
Explanation
The first line of the example states that this is a web service, written in VBScript,
with the class name “TempConvert”.
If you are familiar with the .NET framework, you will recognize that the following
lines import the namespace “System.Web.Services” from the .NET framework.
Imports System
Imports System.Web.Services
The next line defines that the “TempConvert” class is a WebService class type:
Then comes the VB programming. This application has two functions: one to convert
from Fahrenheit to Celsius, and one to convert from Celsius to Fahrenheit.
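The logic of the two functions can be restated in Python as a plain, non-web sketch. This is not the original VB code, which is not reproduced here. The string-in, string-out signatures are an assumption based on the s:string types in the generated WSDL, and returning "Error" for non-numeric input is likewise an assumption made for illustration.

```python
def fahrenheit_to_celsius(fahrenheit: str) -> str:
    """Convert a Fahrenheit reading given as text; return "Error" if not numeric."""
    try:
        value = float(fahrenheit)
    except ValueError:
        return "Error"
    return str((value - 32) * 5 / 9)


def celsius_to_fahrenheit(celsius: str) -> str:
    """Convert a Celsius reading given as text; return "Error" if not numeric."""
    try:
        value = float(celsius)
    except ValueError:
        return "Error"
    return str(value * 9 / 5 + 32)


print(fahrenheit_to_celsius("212"))
print(celsius_to_fahrenheit("100"))
print(celsius_to_fahrenheit("hot"))
```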
The only difference from a normal application is that each function is defined as a
“WebMethod()”.
Look for “WebMethod()” in the code sequence to see this.
Then, end the class:
end class
Publish this .asmx file on a server with .NET support. You now have your first
working example of a web service. You may ask where the WSDL and
SOAP documents are. Don’t panic. With ASP.NET, you do not have to write your own
WSDL and SOAP documents. ASP.NET automatically creates the WSDL and the SOAP
request.
Host: www.w3schools.com
Content-Length: length
<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
HTTP/1.1 200 OK
Content-Length: length
<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <CelsiusToFahrenheitResponse xmlns="http://tempuri.org/">
      <CelsiusToFahrenheitResult>string</CelsiusToFahrenheitResult>
    </CelsiusToFahrenheitResponse>
  </soap12:Body>
</soap12:Envelope>
This example makes the structure of a SOAP 1.2 message clear. This
code segment illustrates only the Celsius to Fahrenheit conversion. The Fahrenheit to Celsius
code can be tried on your own.
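On the client side, the result has to be read back out of such a response. The sketch below does this with Python’s standard xml.etree.ElementTree module. It assumes the namespace URIs in their plain form (http://www.w3.org/2003/05/soap-envelope and http://tempuri.org/), and the value 212 is a made-up sample in place of the “string” placeholder.

```python
import xml.etree.ElementTree as ET

# A sample SOAP 1.2 response shaped like the one above (sample value: 212).
RESPONSE = """<soap12:Envelope xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <CelsiusToFahrenheitResponse xmlns="http://tempuri.org/">
      <CelsiusToFahrenheitResult>212</CelsiusToFahrenheitResult>
    </CelsiusToFahrenheitResponse>
  </soap12:Body>
</soap12:Envelope>"""

# Prefixes used in the query below; "t" is our own alias for the service namespace.
ns = {
    "soap12": "http://www.w3.org/2003/05/soap-envelope",
    "t": "http://tempuri.org/",
}

envelope = ET.fromstring(RESPONSE)
result_element = envelope.find(
    "soap12:Body/t:CelsiusToFahrenheitResponse/t:CelsiusToFahrenheitResult", ns
)
result = result_element.text
print(result)
```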
Look at the automatically created XML file containing the WSDL code.
<wsdl:definitions targetNamespace="http://tempuri.org/">
  <wsdl:types>
    <s:schema elementFormDefault="qualified"
        targetNamespace="http://tempuri.org/">
      <s:element name="FahrenheitToCelsius">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1" name="Fahrenheit"
                type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="FahrenheitToCelsiusResponse">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1"
                name="FahrenheitToCelsiusResult" type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="CelsiusToFahrenheit">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1" name="Celsius"
                type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="CelsiusToFahrenheitResponse">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1"
                name="CelsiusToFahrenheitResult" type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="string" nillable="true" type="s:string"/>
    </s:schema>
  </wsdl:types>
  <wsdl:message name="FahrenheitToCelsiusSoapIn">
    <wsdl:part name="parameters" element="tns:FahrenheitToCelsius"/>
  </wsdl:message>
  <wsdl:message name="FahrenheitToCelsiusSoapOut">
    <wsdl:part name="parameters" element="tns:FahrenheitToCelsiusResponse"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitSoapIn">
    <wsdl:part name="parameters" element="tns:CelsiusToFahrenheit"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitSoapOut">
    <wsdl:part name="parameters" element="tns:CelsiusToFahrenheitResponse"/>
  </wsdl:message>
  <wsdl:message name="FahrenheitToCelsiusHttpPostIn">
    <wsdl:part name="Fahrenheit" type="s:string"/>
  </wsdl:message>
  <wsdl:message name="FahrenheitToCelsiusHttpPostOut">
    <wsdl:part name="Body" element="tns:string"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitHttpPostIn">
    <wsdl:part name="Celsius" type="s:string"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitHttpPostOut">
    <wsdl:part name="Body" element="tns:string"/>
  </wsdl:message>
  <wsdl:portType name="TempConvertSoap">
    <wsdl:operation name="FahrenheitToCelsius">
      <wsdl:input message="tns:FahrenheitToCelsiusSoapIn"/>
      <wsdl:output message="tns:FahrenheitToCelsiusSoapOut"/>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <wsdl:input message="tns:CelsiusToFahrenheitSoapIn"/>
      <wsdl:output message="tns:CelsiusToFahrenheitSoapOut"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:portType name="TempConvertHttpPost">
    <wsdl:operation name="FahrenheitToCelsius">
      <wsdl:input message="tns:FahrenheitToCelsiusHttpPostIn"/>
      <wsdl:output message="tns:FahrenheitToCelsiusHttpPostOut"/>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <wsdl:input message="tns:CelsiusToFahrenheitHttpPostIn"/>
      <wsdl:output message="tns:CelsiusToFahrenheitHttpPostOut"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="TempConvertSoap" type="tns:TempConvertSoap">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="FahrenheitToCelsius">
      <soap:operation soapAction="http://tempuri.org/FahrenheitToCelsius"
          style="document"/>
      <wsdl:input>
        <soap:body use="literal"/>
      </wsdl:input>
      <wsdl:output>
        <soap:body use="literal"/>
      </wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <soap:operation soapAction="http://tempuri.org/CelsiusToFahrenheit"
          style="document"/>
      <wsdl:input>
        <soap:body use="literal"/>
      </wsdl:input>
      <wsdl:output>
        <soap:body use="literal"/>
      </wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:binding name="TempConvertSoap12" type="tns:TempConvertSoap">
    <soap12:binding transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="FahrenheitToCelsius">
      <soap12:operation soapAction="http://tempuri.org/FahrenheitToCelsius"
          style="document"/>
      <wsdl:input>
        <soap12:body use="literal"/>
      </wsdl:input>
      <wsdl:output>
        <soap12:body use="literal"/>
      </wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <soap12:operation soapAction="http://tempuri.org/CelsiusToFahrenheit"
          style="document"/>
      <wsdl:input>
        <soap12:body use="literal"/>
      </wsdl:input>
      <wsdl:output>
        <soap12:body use="literal"/>
      </wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:binding name="TempConvertHttpPost"
      type="tns:TempConvertHttpPost">
    <http:binding verb="POST"/>
    <wsdl:operation name="FahrenheitToCelsius">
      <http:operation location="/FahrenheitToCelsius"/>
      <wsdl:input>
        <mime:content type="application/x-www-form-urlencoded"/>
      </wsdl:input>
      <wsdl:output>
        <mime:mimeXml part="Body"/>
      </wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <http:operation location="/CelsiusToFahrenheit"/>
      <wsdl:input>
        <mime:content type="application/x-www-form-urlencoded"/>
      </wsdl:input>
      <wsdl:output>
        <mime:mimeXml part="Body"/>
      </wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="TempConvert">
    <wsdl:port name="TempConvertSoap" binding="tns:TempConvertSoap">
      <soap:address
          location="http://www.w3schools.com/webservices/tempconvert.asmx"/>
    </wsdl:port>
    <wsdl:port name="TempConvertSoap12"
        binding="tns:TempConvertSoap12">
      <soap12:address
          location="http://www.w3schools.com/webservices/tempconvert.asmx"/>
    </wsdl:port>
    <wsdl:port name="TempConvertHttpPost" binding="tns:TempConvertHttpPost">
      <http:address
          location="http://www.w3schools.com/webservices/tempconvert.asmx"/>
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>
This code shows the WSDL description and the SOAP messages behind a simple web
service that converts Celsius to Fahrenheit and vice versa.
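A WSDL file is ordinary XML, so a client tool can read it to find out which operations a
service offers. The following is a minimal sketch in Python under some assumptions: the
WSDL fragment is a shortened copy of the TempConvertSoap port type, the namespace
declarations (which the listing above does not show) are filled in with the standard
WSDL 1.1 namespace, and the function name list_operations is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard WSDL 1.1 namespace; the listing in the text omits the declaration.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Shortened copy of the portType section, with namespaces declared so it parses.
WSDL = """<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://tempuri.org/" targetNamespace="http://tempuri.org/">
  <wsdl:portType name="TempConvertSoap">
    <wsdl:operation name="FahrenheitToCelsius">
      <wsdl:input message="tns:FahrenheitToCelsiusSoapIn"/>
      <wsdl:output message="tns:FahrenheitToCelsiusSoapOut"/>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <wsdl:input message="tns:CelsiusToFahrenheitSoapIn"/>
      <wsdl:output message="tns:CelsiusToFahrenheitSoapOut"/>
    </wsdl:operation>
  </wsdl:portType>
</wsdl:definitions>"""


def list_operations(wsdl_text):
    """Map each portType name to a list of (operation, input, output) tuples."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        operations = []
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            inp = op.find("wsdl:input", WSDL_NS)
            out = op.find("wsdl:output", WSDL_NS)
            operations.append((
                op.get("name"),
                None if inp is None else inp.get("message"),
                None if out is None else out.get("message"),
            ))
        result[port_type.get("name")] = operations
    return result


for name, ops in list_operations(WSDL).items():
    for op_name, in_msg, out_msg in ops:
        print(name, op_name, in_msg, out_msg)
```

Each operation is reported with its input and output message names, which is the
information a client needs before it can build the matching SOAP request.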
4.4.5 Web Services stack
When we talk about the web services stack, two items must be distinguished: the
web services protocol stack and the web services stack. A web service protocol stack
(Wikipedia definition) is a stack of computer networking protocols that are used to define,
locate, implement, and make Web services interact with each other. A web service protocol
stack typically stacks four types of protocols:
(Service) Transport Protocol: This is responsible for transporting messages
between network applications and includes protocols such as HTTP, SMTP, FTP,
as well as the more recent Blocks Extensible Exchange Protocol (BEEP).
(XML) Messaging Protocol: This protocol is responsible for encoding messages
in a common XML format so that they can be understood at either end of a network
connection. Currently, this area includes such protocols as XML-RPC, WS-
Addressing, and SOAP.
(Service) Description Protocol: This protocol is used for describing the public
interface to a specific web service. The WSDL interface format is typically used
for this purpose.
(Service) Discovery Protocol: This protocol centralizes services into a common
registry such that network web services can publish their location and description,
and makes it easy to discover what services are available on the network. At
present, the UDDI API is normally used for service discovery.
The web service protocol stack also includes a whole range of more recently defined
protocols such as BPEL and SOAP-DSIG.
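The layering of the messaging protocol on top of the transport protocol can be sketched as
follows. This is an illustration only: it builds the bytes of an HTTP POST that carries a
SOAP 1.2 envelope and sends nothing over the network. The host and path are taken from
the WSDL listing earlier in this unit, and the function names are hypothetical.

```python
from xml.sax.saxutils import escape


def build_soap_envelope(celsius):
    """Messaging layer: wrap the request data in a SOAP 1.2 envelope."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<soap12:Envelope xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">'
        "<soap12:Body>"
        '<CelsiusToFahrenheit xmlns="http://tempuri.org/">'
        f"<Celsius>{escape(str(celsius))}</Celsius>"
        "</CelsiusToFahrenheit>"
        "</soap12:Body>"
        "</soap12:Envelope>"
    )


def wrap_in_http(envelope, host, path):
    """Transport layer: place the envelope in the body of an HTTP POST request."""
    payload = envelope.encode("utf-8")
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/soap+xml; charset=utf-8",  # SOAP 1.2 media type
        f"Content-Length: {len(payload)}",
    ]
    return "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + payload


request = wrap_in_http(
    build_soap_envelope(100), "www.w3schools.com", "/webservices/tempconvert.asmx"
)
print(request.decode("utf-8"))
```

The envelope knows nothing about HTTP, and the HTTP wrapper treats the envelope as opaque
bytes. That separation is what allows the same message to travel over another transport
such as SMTP.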
A Web services stack, on the other hand, is a narrower thing. It is the software that
supports the Web services standards so that you can send and receive SOAP messages and
work with UDDI and WSDL. The ultimate aim is to get every vendor (IBM, for example)
to agree on a single Web services stack, that is, on the protocols used to define, locate,
implement and make Web services interact. We have studied the J2EE and .NET
frameworks in the previous section. The open question is interoperability between these
technologies, and a common web services stack is a step towards a solution.
Metro
The Metro Web Services stack delivers secure, reliable, transactional interoperability
between Java EE and .NET 3.0 to help you build, deploy, and maintain Composite
Applications for your Service Oriented Architecture. Metro provides ease-of-development
features, support for W3C and WS-I standards such as SOAP and WSDL, asynchronous
client and server, and data binding through JAXB 2.0.
GlassFish
Web services are Web based applications that use open, XML-based standards and
transport protocols to exchange data with clients. Web services are developed using Java
Technology APIs and tools provided by an integrated Web Services Stack called Metro.
The Metro stack, consisting of JAX-WS, JAXB, and WSIT, enables you to create and
deploy secure, reliable, transactional, interoperable Web services and clients. The Metro
stack is part of Project Metro and is included in GlassFish and in Java Platform, Enterprise
Edition (Java EE), and partially in Java Platform, Standard Edition (Java SE). GlassFish and
Java EE also support the legacy JAX-RPC APIs.
Axis 2.0 runs on WebSphere, on WebLogic from BEA Systems Inc., and on
Apache’s own Tomcat, and has demonstrated interoperability with the Microsoft .NET
framework. BEA and JBoss, a division of Red Hat Inc., have chosen to develop
their own Web services stacks. BEA offers SALT 1.1, a native Tuxedo Web service stack
built on an open-standard SOAP implementation. JBossWS is a JAX-WS compliant Web
services stack developed to be part of JBoss’ Java EE5 support. Still, it is convenient to
have a single stack that runs on WebSphere, Tomcat and WebLogic.
WSBL
Web Service Business Library (WSBL) is a solution for companies that offer
financial services; it combines agent theory, web services and grid computing. With this
approach a bank needs only one library for pricing all its products, running on
one grid and serving all of the trading rooms that the bank has around the world.
These services could also be sold to third-party users with the appropriate security services.
4.5 EBXML
The web services architecture is centred on repositories that allow businesses to
find each other and use the services each of them provides. Finding the required
services through the web from a centralized information source is effective, but
there are other approaches that achieve the same goal. One of them is ebXML
(Electronic Business XML), a global initiative to define processes around which
businesses can interact over the web. The vision of ebXML is to create a single global
electronic marketplace where enterprises of any size and in any geographical location
can meet and conduct business with each other through the exchange of XML-based
messages. To facilitate this, ebXML provides
an infrastructure for data communication interoperability,
a semantic framework for commercial interoperability,
a mechanism that allows enterprises to find each other, establish a relationship, and
conduct business with each other.
The organizations behind this project are UN/CEFACT (United Nations Centre for Trade
Facilitation and Electronic Business) and OASIS (the Organization for the Advancement
of Structured Information Standards). According to the Wikipedia description of ebXML,
the original project envisioned five layers of data specification, including XML standards
for:
Business processes,
Collaboration protocol agreements,
Core data components,
Messaging,
Registries and repositories
This initiative continued to gain support from a variety of sources and other standards
organizations. Some of these organizations are RosettaNet (a consortium of more than 400
companies), the Global Commerce Initiative (representing manufacturers and retailers),
the Open Applications Group Inc., the Automotive Industry Action Group, Health Level
Seven and the Open Travel Alliance.
This initiative is based on a set of building blocks that make use of existing standards
wherever possible. The ebXML Technical Architecture comprises two basic components:
Design Time and Run Time. Business Process and Business Information Analysis is a part of
the Design Time component. The Design Time component deals with the procedures for creating
an application of the ebXML infrastructure, as well as the actual discovery and enablement of
ebXML-related resources required for business transactions to take place. The Run Time
component covers the execution of an ebXML scenario with the actual associated ebXML
transactions.
The following are some of the components of the technical architecture:
Messaging
Business Process
ebXML distinguishes itself from other XML frameworks by the emphasis it gives to the
business process. The overall process includes
Process Definition
utilizing Business Process and Business Document Analysis
logical progress to Partner Discovery
Partner Sign-Up
Electronic Plug-in
Process Execution
Process Management
Process Evolution
Here modeling languages and charting tools are used to standardize and capture the
flow of business data among the trading partners.
Each trading partner has its own Collaboration Protocol Profile (CPP)
document that describes its capabilities in an XML format. For example, it may include the
messaging protocols or the security capabilities the partner supports. A Collaboration
Protocol Agreement (CPA) document is the intersection of two CPP documents, and describes
the formal relationship between two parties. The following information will typically be
contained in a CPA document:
Identification information
Security information
Communication information
Endpoint locations
Rules to follow when acknowledgments are not received for messages, including
how long to wait before resending, and how many times to resend
Whether duplicate messages should be ignored
Whether acknowledgments are required for all messages
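The idea that an agreement is the intersection of what two partners each support can be
sketched in a few lines of Python. This is a simplified illustration, not the ebXML
document format: the CPP class, its two fields, the party names and the negotiate_cpa
function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CPP:
    """A toy Collaboration Protocol Profile: what one partner is able to do."""
    party: str
    transports: frozenset
    security: frozenset


def negotiate_cpa(a, b):
    """Return a toy agreement: the capabilities both partners share, or None."""
    transports = a.transports & b.transports   # set intersection
    security = a.security & b.security
    if not transports or not security:
        return None                            # nothing in common, no agreement
    return {
        "parties": (a.party, b.party),
        "transports": sorted(transports),
        "security": sorted(security),
    }


buyer = CPP("Buyer Ltd", frozenset({"HTTP", "SMTP"}), frozenset({"TLS", "XML-DSIG"}))
seller = CPP("Seller Inc", frozenset({"HTTP"}), frozenset({"TLS"}))
print(negotiate_cpa(buyer, seller))
```

A real agreement would also record the endpoint locations and the retry and
acknowledgment rules listed above. Those are left out to keep the sketch short.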
Registries and repositories
ISO 15000-4 is the standard for the ebXML Registry Services Specification. The registry
contains the industry processes, messages and vocabularies that define the transactions
that occur between trading partners.
Core Components
The ebXML Methodology for the Discovery and Analysis of Core Components
describes the process for identifying information components that are re-usable across
industries. Core components are used to define domain components and business
information objects. Business libraries, which contain libraries of business process
specifications, are instrumental in the discovery and analysis of core components and domain
components.
Thus web services can be viewed as a bundle of technologies that takes the web
beyond a content delivery network towards server-to-server interaction. We explored
UDDI for registering and storing services, and WSDL for figuring out how to connect to
existing services. We also saw how SOAP makes a decentralized, distributed space possible,
and we looked at ebXML in detail.
So far we have studied various protocols and standards for web service
implementations, such as SOAP, UDDI and WSDL. What functionality do they provide?
They facilitate the transport of messages, the discovery of services and the
establishment of connections. This may look sufficient, but it is not. These protocols
do not provide any functionality for the following critical requirements of the
electronic enterprise:
Transactions
Security
Identity
It is now important to understand how far web services will cater to the dynamic needs
of the industry. The web services battle lines are shaping up along two fronts: Microsoft
with its .NET initiative and Sun with the J2EE architecture. In this section let us look into
the web-services-related strategies of both .NET and J2EE.
Generally, the interaction between servers may be based on any available protocol.
The traditional enterprise computing model is based on middleware and application servers
tied to tightly coupled networks. The introduction of loosely coupled, message-based
architectures has changed the computing landscape of server-to-server interactions.
However, making this loosely coupled web space commercially viable for service-based
interaction requires transactional capabilities to ensure the following:
Stability and regularity across networks
Security to protect transactions
Managing the identity in open networks
The above points clearly indicate the importance of transactions, security and
identity in the emerging world of SOAP and Web Services for the success of the
new web environment. Let us look into transactions, security and identity in detail in
the next section.
Transactions
Transactions are sets of software operations that form the basic units of an electronic
commerce venture. They should possess the properties of Atomicity, Consistency, Isolation
and Durability, also known as the ACID properties.
Atomicity means that either all the operations of a transaction are performed or none
of them is.
Consistency means that the transaction should preserve the consistent state of the
data while performing its operations.
Isolation means that transactions running at the same time do not see each other’s
intermediate results.
Durability means that the updates made by committed transactions persist in the
database in spite of failures that occur after the commit operation. This means that
the data changes are recoverable even after any failure or crash.
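The following sketch shows atomicity and consistency in practice, using the sqlite3
module from the Python standard library. The accounts table, the names alice and bob, and
the functions make_bank, transfer and balances are assumptions made for illustration. An
in-memory database is used, so durability is not demonstrated here.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two sample accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Consistency rule of this sketch: no negative balances.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


bank = make_bank()
print(transfer(bank, "alice", "bob", 30), balances(bank))
print(transfer(bank, "alice", "bob", 500), balances(bank))
```

In the second call the debit has already been executed when the check fails, but the
rollback undoes it, so the balances are the same as before the call.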
Hence transactions are vital for any architecture that handles web-based e-commerce
applications. This has led software vendors to concentrate on developing
transaction monitors, standard interfaces to a variety of back-end databases,
and so on.
Security
The internet relies on several security protocols. For web-based e-commerce, verifying
the authenticity of web sites and encrypting the data to be transferred are some of the
security measures required. Even though the Secure Sockets Layer and Transport Layer
Security protocols have been successful in achieving this, they are not enough. This is
explained in detail in Unit V.
Identity
Nowadays, in the web environment, user identity is the center of attention: the
focus has shifted from the machine to the user. Initially, machine details were
used as the key for licensing and installing software on a machine. Without such a license,
the software could not be installed or, if installed, would run illegitimately. However,
consider what happens when the hardware is validated rather than the user. A user
connecting via the web may be in any geographical location, and
will not be in a position to use the licensed software unless
he carries the hardware along with him. Hence a new model is required in which user
authentication becomes the key issue. The prime question is whether the software package
is licensed to run for a particular user. This is achieved by validating the
user against permissions stored in a database, which determine what the user can and
can’t do.
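A user-based licence check of the kind described above can be sketched as a simple
permission lookup. The user names, the permission strings and the function is_licensed
are hypothetical; a real system would keep the permissions in a database rather than in a
dictionary.

```python
# Hypothetical permission store: user name -> set of granted permissions.
PERMISSIONS = {
    "asha": {"run:editor", "run:reports"},
    "ravi": {"run:editor"},
}


def is_licensed(user, package, permissions=PERMISSIONS):
    """Return True if the user, not the machine, holds a licence for the package."""
    return f"run:{package}" in permissions.get(user, set())


print(is_licensed("asha", "reports"))
print(is_licensed("ravi", "reports"))
```

Because the check is keyed on the user, the answer is the same no matter which machine
or location the user connects from.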
Let us discuss two technologies for managing user identity: Microsoft’s Passport and
the Sun-supported Liberty Alliance.
Passport
Microsoft handles user identity through a facility called Passport. Passport
can store credit card and address information as part of a user’s account. It is also used as
an entry point to .NET My Services, which is a way of bringing Web services to
consumer applications. What facilities does a Passport give the user? With access to
Passport the user can
Participate in express purchasing over the web without manually entering their
address information and payment information
Liberty Alliance
This is an alternative technology to Passport, promoted by Sun Microsystems as a
single-sign-on authentication service. The objective is to create a universal digital identity
service based on open standards. Users would log in once on a given web
site and would then be authenticated for all online services supporting the Liberty standard.
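The log-in-once idea can be illustrated with a signed token that any participating
service is able to verify. This sketch is a deliberate simplification and does not
reproduce the Liberty Alliance protocols: the shared secret, the token layout and the
function names issue_token and verify_token are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret shared by the identity provider and the participating services.
SECRET = b"demo-shared-secret"


def issue_token(user, secret=SECRET):
    """Identity provider: after one successful login, hand out a signed token."""
    signature = hmac.new(secret, user.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{user}:{signature}"


def verify_token(token, secret=SECRET):
    """Any participating service: return the user name if the token is genuine."""
    user, _, signature = token.rpartition(":")
    expected = hmac.new(secret, user.encode("utf-8"), hashlib.sha256).hexdigest()
    if user and hmac.compare_digest(signature, expected):
        return user
    return None


token = issue_token("asha")      # the user logs in once
print(verify_token(token))       # service A accepts the token
print(verify_token(token))       # service B accepts the same token
```

The user presents the same token to every service, so the password is entered only once.
A production design would add expiry times and would avoid sharing one secret among all
services.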
On one side there are loosely coupled environments; on the other side there are tightly
coupled, object-based environments. Many technologies are available for working in
each of them. The challenge is to bridge the gap between these two environments.
For example, assume that the transaction engines run on tightly coupled networks,
while the SOAP-based data is available across the web space, which is based on loosely
coupled systems. The question is how to bring transactional integrity to the link between
the web, with its promise of global connectivity, and the more conventional middleware
that holds the key to transactions, security and identity.
We can say that the tightly coupled object-based frameworks have been subsumed by
.NET and J2EE. While these two technologies are often compared with each other, they
have elemental differences that make direct comparison difficult. Even so, the
following points can be noted to make the differences clear:
Let us see the implementations on the J2EE and .NET platforms in detail in the
following section.
The .NET initiative focuses on a development framework that integrates all the earlier
Microsoft technologies with newer technologies built around XML. What can we do with
this .NET framework? Here is the answer:
.NET consists of
development tools
run-time environments
server infrastructure
Intelligent software to build applications for various platforms and devices
.NET integrates various applications and devices by using standards
Hypertext Transfer Protocol (HTTP)
XML
Simple Object Access Protocol (SOAP)
The tools .NET provides are
Smart Client software
Using XML web services, a client, a PC, or a mobile device can access data from
any location or device
A complete solution for building, hosting, and consuming XML Web services is
provided by Visual Studio .NET and the .NET Framework. Visual Studio .NET also
supports a variety of programming environments and languages. In addition, it provides
single-point access to all the tools that we require, making it one of the most productive
tools available.
XML Web services comprise the core components that enable a client application to
exchange data with another client or server application, as shown in figure 4.4.
Server applications can also exchange data with one another with the help of Web services.
Likewise, applications running on any device can exchange data with applications running
on any other device.
[Figure 4.4. Labels: Desktop Computers, Mobile devices, Servers, Win 32]
Create a class and define the functionality of the application in terms of properties,
events, and methods of the class.
For the Web applications, the code that controls the behavior of the Web page is
encapsulated within a class.
Hence the classes support object-oriented features such as inheritance,
encapsulation, and polymorphism. Therefore, classes are fundamental to
programming in the .NET environment.
Classes can be created in any language supported by the .NET Framework.
A class written in one language is reusable by classes written in other languages.
Classes inherit across language boundaries because the .NET Framework allows
language interoperability and supports cross-language inheritance
This includes the CLI and provides the execution environment to .NET applications.
This component provides the necessary data types, value and object types to develop
applications in different languages. All the .NET languages share a Common Type System.
For example, a String in Visual Basic .NET is the same as a String in Visual C# or
in Visual C++ .NET, since all the .NET languages have access to the same class
libraries.
Type safety.
Side-by-side execution.
Here an entity called an “assembly” is used; it contains the IL code and metadata. The
metadata contains information such as the version of the assembly and the name and version
of the other assemblies on which the assembly depends. What is the use of an assembly?
Assemblies are the deployment units in the .NET Framework, and they allow you to deploy
multiple versions of an application on a system. The
common language runtime uses the version information in the metadata to determine
application dependencies and enables you to execute multiple versions of an application
side-by-side.
4.6.3 J2EE
J2EE is the Java-centric enterprise platform specification. Even though J2EE originated
with Sun, the complete specification and changes to it are under the
collaborative umbrella of the Java Community Process. In J2EE, web services
use standards-based frameworks to extend an application’s reach. However, a web service
isn’t the application itself. The web service must still be implemented on a proven application
infrastructure, one that supports reliability, availability, serviceability, transactions, security,
and other critical enterprise needs; the J2EE infrastructure provides this. It includes the
following APIs.
The Java API for XML Messaging (JAXM) and the Java API for XML-based RPC
(JAX-RPC) are both part of the Java Web Services Developer Pack. These APIs are a key
part of Sun’s plans to integrate web services interfaces into future versions of the
J2EE platform.
JAXM defines a common set of Java APIs
for creating, consuming, and exchanging SOAP envelopes over various transport
mechanisms. It is mainly used for document-style exchange of information. It requires the
use of low-level APIs to manipulate the SOAP envelope directly.
Simply put, JAX-RPC provides a way of performing RMI-like Remote Procedure
Calls over SOAP. It also defines rules for such things as client code generation, SOAP
bindings, WSDL-to-Java and Java-to-WSDL mappings, and data mappings between Java
and SOAP.
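The essence of an RPC over SOAP is that a method name and its arguments are marshalled
into an envelope on the caller’s side and unmarshalled on the receiver’s side. The sketch
below shows that round trip in Python; it is not the JAX-RPC API, and the function names
marshal_call and unmarshal_call are hypothetical. The SOAP 1.1 envelope namespace is
used.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def marshal_call(method, namespace, **params):
    """Client stub side: turn a method call into a SOAP envelope string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{namespace}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def unmarshal_call(xml_text):
    """Server side: recover the method name and parameters from the envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]      # first child of Body
    method = call.tag.split("}", 1)[1]                 # strip the namespace part
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return method, params


message = marshal_call("CelsiusToFahrenheit", "http://tempuri.org/", Celsius=100)
print(message)
print(unmarshal_call(message))
```

All values travel as text, so the receiver gets the string "100" back. Mapping such
strings to typed values is the job of the data mapping rules mentioned above.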
SOAP is the basis of interoperability between J2EE and
web services. Understanding how J2EE and web services work together therefore comes
down to analyzing how SOAP and J2EE work together.
SOAP is a wire protocol that can be layered upon other wire protocols such as
HTTP, FTP, and SMTP
J2EE supports these Internet protocols through servlets
Hence, servlets and JSP technology will become the entry point into a J2EE
framework for web services
Within J2EE, servlets, JSPs, EJBs, JMS resources, JDBC drivers, and J2EE CA
adapters provide access to the business logic and enterprise resources that a web
service needs
Servlets and JSPs are designed to encapsulate page-based flow and logic and
can also work with numerous Internet protocols
The servlets are responsible for extracting the SOAP contents from the packet of the
underlying wire protocol.
The SOAP contents must then be parsed so the servlet can acquire access to the
elements and attributes contained within the SOAP document.
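The two servlet duties described above, extracting the SOAP contents from the wire packet
and then parsing them, can be sketched in Python as follows. This is not servlet code: the
raw request is a hand-built sample, the host and path come from the WSDL listing earlier
in this unit, and the function name extract_soap_payload is hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"

# Hand-built sample request: HTTP headers followed by a SOAP 1.2 envelope.
SOAP_BODY = (
    b'<soap12:Envelope xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">'
    b"<soap12:Body>"
    b'<CelsiusToFahrenheit xmlns="http://tempuri.org/">'
    b"<Celsius>100</Celsius>"
    b"</CelsiusToFahrenheit>"
    b"</soap12:Body>"
    b"</soap12:Envelope>"
)
RAW_REQUEST = (
    "POST /webservices/tempconvert.asmx HTTP/1.1\r\n"
    "Host: www.w3schools.com\r\n"
    "Content-Type: application/soap+xml; charset=utf-8\r\n"
    f"Content-Length: {len(SOAP_BODY)}\r\n"
    "\r\n"
).encode("ascii") + SOAP_BODY


def extract_soap_payload(raw_request):
    """Split off the HTTP headers, then parse the SOAP body that follows them."""
    head, _, body = raw_request.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:          # skip the request line
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers.get("content-length", len(body)))
    envelope = ET.fromstring(body[:length])
    soap_body = envelope.find(f"{{{SOAP12_NS}}}Body")
    if soap_body is None or len(soap_body) == 0:
        raise ValueError("no payload inside the SOAP Body")
    payload = soap_body[0]
    # Tags are returned in {namespace}name form, as ElementTree stores them.
    return payload.tag, {child.tag: child.text for child in payload}


print(extract_soap_payload(RAW_REQUEST))
```

Once the payload element and its children are available, the servlet can hand the values
to the business logic behind it.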
Based upon how WSDL, JAXM, and JAX-RPC eventually define the behavior of
web services, four fundamental types of messages can be transported over SOAP:
Request/response
Solicit/response
One-way
Notification
These four types of behavior have already been explained. This is how the integration
between web services and J2EE works.
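In WSDL 1.1 the four message types are told apart by which of the input and output
elements an operation contains, and in which order. The sketch below classifies
operations on that basis. The port type and the operation names Convert, Log, Poll and
Alert are invented for illustration, as is the function name classify_operations.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Order of input/output children -> message exchange type.
PATTERNS = {
    ("input",): "one-way",
    ("input", "output"): "request/response",
    ("output", "input"): "solicit/response",
    ("output",): "notification",
}

# Invented port type with one operation of each kind.
SAMPLE = """<wsdl:portType xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" name="Demo">
  <wsdl:operation name="Convert">
    <wsdl:input message="tns:ConvertIn"/>
    <wsdl:output message="tns:ConvertOut"/>
  </wsdl:operation>
  <wsdl:operation name="Log">
    <wsdl:input message="tns:LogIn"/>
  </wsdl:operation>
  <wsdl:operation name="Poll">
    <wsdl:output message="tns:PollOut"/>
    <wsdl:input message="tns:PollIn"/>
  </wsdl:operation>
  <wsdl:operation name="Alert">
    <wsdl:output message="tns:AlertOut"/>
  </wsdl:operation>
</wsdl:portType>"""


def classify_operations(port_type_xml):
    """Return a dict mapping each operation name to its message exchange type."""
    port_type = ET.fromstring(port_type_xml)
    wanted = {f"{{{WSDL_NS}}}input": "input", f"{{{WSDL_NS}}}output": "output"}
    result = {}
    for op in port_type.findall(f"{{{WSDL_NS}}}operation"):
        order = tuple(wanted[child.tag] for child in op if child.tag in wanted)
        result[op.get("name")] = PATTERNS.get(order, "unknown")
    return result


print(classify_operations(SAMPLE))
```

Both operations of the temperature conversion service list an input followed by an
output, so they fall into the request/response category.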
A newly proposed standard called the Java Web Service (JWS) standard is currently
in development. It is spearheaded by BEA Systems, which also has a reference
implementation. What is it? It is a format designed to bring non-Java
developers to J2EE, which sounds ambitious. At the core of the JWS specification is the
idea that developers do not create J2EE components. Rather, a developer creates a
web service as a single Java class that contains the code for the intended web service.
The Java class then carries a number of simple, predefined JavaDoc tags that indicate
different behavioral implementations of the web service. Based on the values of the JavaDoc
tags inserted into the Java class, a behind-the-scenes code generator then creates all
the J2EE components required to implement the web service.
The JWS JavaDoc system has tags representing a full range of web service behaviors,
including stateless methods, stateful methods, and asynchronous invocations. The challenge
left to JWS implementations is to take the definition of the JavaDoc tags and generate
J2EE components that implement this behavior in a reliable and available manner.
Since the idea is interesting and appealing, tool vendors may support BEA’s prototype
implementation. It also comes with a nice IDE that ties together design, coding, and testing.
The concept of deployment is completely hidden from the developer. The goal is to have a
framework for developing web services with J2EE that is similar to working in Visual
Basic.
Different vendors such as IBM, BEA, Oracle and HP implement these APIs according to
the J2EE specification. However, each vendor also provides additional features when
implementing these APIs.
Summary
XML and SOAP provide the data and transport facilities; web services provide the
protocols for discovery and connection. Using these, service-oriented applications can be
developed. Microsoft’s .NET framework and the Sun-led J2EE are the two platforms available
for vendors to choose from. Which platform should one choose? The bottom line is that the
choice between the two will always be a choice between product platforms, driven by the
services a vendor can provide and a company’s vision of its future.
Questions
1. Web service
a. Is a process
b. Is a technology
c. Is a phenomenon
d. All of the above
2. Which of the following is not possible through web services
a. The desire to allow businesses to use the Internet
b. To improve collaboration with customers, partners and suppliers
c. To have complex trading partner interactions
d. None of the above
3. Web services registry support
a. White pages
b. Yellow pages
c. Green pages
d. All of the above
4. White pages provide
a. Contact information of a given business
b. Categories of businesses based on existing standards
c. Technical information about the web services provided by a given business
d. None of the above
5. Yellow pages provide
a. Contact information of a given business
b. Categories of businesses based on existing standards
c. Technical information about the web services provided by a given business
d. None of the above
6. Green pages provide
a. Contact information of a given business
b. Categories of businesses based on existing standards
c. Technical information about the web services provided by a given business
d. None of the above
7. Web services is meant for
a. Human to computer interactions
b. Computer to computer interactions
c. Human to human interactions
d. None of the above
8. The web services triad architecture does not include
a. A service provider
b. A service requester
c. A broker
d. Directories
9. Which of the following specification(s) are included in UDDI framework?
a. UDDI Programmer’s API Specification
b. UDDI Data structure Specification
c. UDDI Service specification
d. a & b
10. Which of the following is a major data structure used by UDDI Programmer API?
a. businessEntity
b. businessService
c. tModel
d. All of the above
11. QOS issues are addressed by UDDI by defining a calling convention that involves the
use of cached
a. businessEntity
b. businessService
c. bindingTemplates
d. None of the above
12. OASIS stands for
a. Organization for Advanced Software Initiative Systems
b. Organization for Advancement of Software and Information Systems
c. Organization for Advancement for Structured Information Standards
d. None of the above
13. The technical architecture of ebXML consists of
a. Messaging
b. Business processes
c. Registries and Repositories
d. All of the above
14. What is CPA?
a. Collaboration Protocol Agreement
b. Collaborative Partner Agents
c. Collaboration Protocol Application
d. None of the above
15. Which one is not a technology component of .NET?
a. Development tools
b. Specialized servers
c. Legacy software
d. Devices
16. MTS stands for
a. Microsoft Transaction System
b. Microsoft Technology Solutions
c. Microsoft Technical Support
d. Microsoft Transaction Server
17. Adapters are needed to
a. Integrate web services
b. Compose Web services
UNIT V
XML SECURITY
5.1 INTRODUCTION
You have been introduced to many concepts in web services. In this unit we are
going to look into web services security. You may wonder whether it is necessary to
study this: won’t the technologies already introduced take care of the security issues? The
following sections address this question, the various levels of web security considerations
and their advantages.
5.1.1 Issues
The new levels of data exchange, data sharing and interoperability
introduce new challenges for security. Unlike closed environments, this open and
loosely coupled environment must meet these challenges in order to remain secure. Some
of the issues are discussed below:
Generally, HTTP traffic flows through port 80, which is accepted as an open hole in
the firewall. Web applications and their interfaces are publicly available to
everyone through port 80. Is that safe? Can we assume that all the information
coming through port 80 is safe? Applications that provide front ends to critical
data will increasingly be exposed through HTTP and accessible to anyone in the
outside world; for example, these applications can even be published in a public
directory for anyone to discover. The important issue here is to check the security
of the web service being used through port 80.
It may be argued that data is secure because it is wrapped in SOAP envelopes. But
doesn’t the envelope itself provide a way to discern the structure and meaning of
the data being sent over the wire?
Sending and receiving applications don’t have to be implemented on the same
software platform; that is, they don’t have to use the same security libraries
from the same vendor. Therefore, don’t we need a set of standardized,
platform-independent security solutions?
If we apply an encryption technique to an XML file, which is generally extremely
verbose, is it not too expensive? Wrapping data in XML can tremendously increase
the size of the data that needs to be encrypted.
Currently, a new set of security techniques is being developed to address these issues.
Many of them are still at the exploration stage and offer only partial, immature,
early-stage solutions. Existing security technologies with proven track records still
have their advantages; in fact, the new techniques are intended to build upon or augment
existing security technologies such as Public Key Infrastructure (PKI), Secure HTTP
(HTTPS), and the Secure Sockets Layer (SSL). Here we focus on the new security
issues and solutions that have come about as a result of web services and their related
technologies. Before that, let us look at an overview of security.
The next question is whether public key cryptography ensures all three dimensions of
secure communication: confidentiality, authentication and data integrity.
Confidentiality: Since the owner of the private key never has to disclose the key to
anyone, confidentiality is maintained. A message encoded with a public key can be
decoded only with the corresponding private key, ensuring that the message is kept
confidential.
Authentication: Although encoding with the public key guarantees secrecy, it cannot
authenticate the sender of the message, because anyone may use the public key. On the
other hand, if the sender encodes the message with his or her private key, the message
can be decoded with the public key, which shows that it came from the holder of the
private key. This ensures authentication of the sender.
Data Integrity: Data integrity makes certain that the message received is the message
sent. How can we ensure, using public key cryptography, that the document has not been
tampered with or altered? Public key cryptography alone does not answer this directly.
What, then, is the solution? Security can be provided by combining public key
cryptography with a validation technique. The technology generally used for validating a
message is called digital hashing. A digital hash is an algorithmically generated short
string of characters that, for practical purposes, uniquely identifies a document. For
example, a digital hash is generated for a document and sent along with the document. If
the document is tampered with in any way during communication, recomputing the digital
hash yields a different result. If the hashes do not match, the data integrity of the
document has been compromised.
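The hash comparison described above can be sketched in Java as follows. This is a minimal illustration, not a prescribed method: the class name DigitalHashDemo, its helper methods and the sample document are hypothetical, and SHA-256 is an assumed choice of hash algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class DigitalHashDemo {
    /** Returns the SHA-256 hash of the text as a lowercase hex string. */
    public static String sha256Hex(String text) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** The receiver recomputes the hash and compares it with the hash that was sent. */
    public static boolean isIntact(String receivedDocument, String sentHash) throws Exception {
        return sha256Hex(receivedDocument).equals(sentHash);
    }

    public static void main(String[] args) throws Exception {
        String document = "<price>6000</price>";   // sample data, assumed for illustration
        String sentHash = sha256Hex(document);      // computed by the sender

        System.out.println(isIntact(document, sentHash));               // true: unchanged
        System.out.println(isIntact("<price>9000</price>", sentHash));  // false: tampered
    }
}
```

Note that a hash sent in the clear only detects accidental or naive changes; an attacker who can alter the document can also replace the hash, which is why the hash is combined with a private key in a digital signature.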
A digital signature is like engraving the identity of the signer across the face of the
document. In other words, it can be viewed as the electronic equivalent of a written
signature. Can it be used along with public-key cryptography to ensure authentication
and data integrity? Yes: a digital signature in combination with public-key encryption
can be used by distributed applications to authenticate the identity of the sender of a
message or document. It also ensures that the message or document has not been changed.
Person X has to
Write the message
Create a digital hash of the message
Encrypt the original message and the digital hash with his private key
Send the encrypted document to his attorney
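A rough Java sketch of these steps is given below, using the standard java.security.Signature class. It is an illustration under assumptions: the class name SignMessageDemo, its helpers and the sample message are hypothetical, and the SHA256withRSA algorithm is an assumed choice. Note also that this API hashes the message and signs only the hash with the private key; the message itself travels alongside the signature rather than being encrypted.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class SignMessageDemo {
    /** Generates a fresh RSA key pair for the demonstration. */
    public static KeyPair newKeyPair() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    /** Sender side: hash the message and sign the hash with the private key. */
    public static byte[] sign(String message, PrivateKey privateKey) throws Exception {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(privateKey);
        signer.update(message.getBytes(StandardCharsets.UTF_8));
        return signer.sign();
    }

    /** Receiver side: check the signature with the sender's public key. */
    public static boolean verify(String message, byte[] signature, PublicKey publicKey)
            throws Exception {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(publicKey);
        verifier.update(message.getBytes(StandardCharsets.UTF_8));
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws Exception {
        KeyPair personX = newKeyPair();
        String message = "Please transfer the documents.";  // sample text, assumed
        byte[] signature = sign(message, personX.getPrivate());

        // The unchanged message verifies; an altered one does not.
        System.out.println(verify(message, signature, personX.getPublic()));
        System.out.println(verify(message + "!", signature, personX.getPublic()));
    }
}
```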
Is the overview of security clear? Now let us see how private keys and certificates
are managed.
Keeping certificates and private keys protected is one of the biggest security
challenges. Even though private and public key pairs are very difficult to memorize (since
they are mathematically generated), the problem of ensuring the confidentiality and
authenticity of the keys received by users still exists. To tackle this, Certificate
Authorities (CAs), which represent “trusted entities” in web security, issue certificates.
Once a CA is chosen, the certificates that it has signed for companies are
trusted. However, trusting a CA is purely the user’s choice. Netscape Navigator and
Microsoft Internet Explorer come with a list of certificates for some trusted CAs (……..).
The browsers provide functions to manage the list of trusted CAs and the expiry of
the certificates issued.
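A Java runtime keeps a similar list of trusted CA certificates. The sketch below, which assumes a standard JDK with its default trust store, lists those CAs together with their expiry dates; the class name TrustedCaList and its method are hypothetical names chosen for illustration.

```java
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

public class TrustedCaList {
    /** Returns one line per CA certificate trusted by the platform by default. */
    public static List<String> trustedIssuers() throws Exception {
        TrustManagerFactory factory =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        factory.init((KeyStore) null);  // null selects the default trust store

        List<String> issuers = new ArrayList<>();
        for (TrustManager manager : factory.getTrustManagers()) {
            if (manager instanceof X509TrustManager) {
                for (X509Certificate ca : ((X509TrustManager) manager).getAcceptedIssuers()) {
                    issuers.add(ca.getSubjectX500Principal().getName()
                            + " (expires " + ca.getNotAfter() + ")");
                }
            }
        }
        return issuers;
    }

    public static void main(String[] args) throws Exception {
        for (String issuer : trustedIssuers()) {
            System.out.println(issuer);
        }
    }
}
```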
5.3 CANONICALIZATION
Once a hash is computed for a document, even a minor change, such as the introduction of
white space, produces a completely different hash. In other words, a secure hash is
intolerant of minor changes in a document. This intolerance is essential, since even a
minor modification of the original document must be detected. However, this feature
presents a problem for XML documents. XML documents are frequently parsed and reparsed
as they are transferred from the sender to the recipient. In this process, the parsers
can make insignificant modifications, such as eliminating white space or an empty line.
These modifications cause a mismatch in the hash. The solution is to put the
XML document into a standard, normalized format before computing the digital
hash. This process of converting the XML document into a standard format is known as
canonicalization. With it, we can be confident that the sender and receiver will compute
the same hash regardless of what processing occurred along the way. The canonical format
was standardized by the W3C in the XML-Canonicalization (xml-c14n) specification.
Guidelines and high-level rules are available for converting a document to an
xml-c14n-compliant canonical format. They are listed below:
Using these guidelines, the XML document is normalized before the hash is
computed.
As we learned in the previous sections, web services based on XML require a security
framework. The following sections explain the three XML security technologies driven by
the W3C, which are the building blocks of the XML security architecture:
XML Encryption
XML Digital Signature
XML Key Management Services
Secured web service messages can be sent and received by using XML encryption,
provided that XML has been chosen as the technology for the task. When the XML file to
be encrypted contains a lot of information, is it necessary to encrypt the entire
contents, or can we encrypt only selected information depending on its confidentiality?
XML encryption meets this requirement: it supports the encryption of all or part of an
XML document. The specification is flexible and allows complete or partial encryption of
the following:
The complete XML document
An element and all its sub elements
The content of an XML element
A reference to a resource outside the document
Thus XML encryption extends the power of the XML digital signature system by
enabling the encryption of a message that has been digitally signed. Since XML encryption
is not bound to any specific encryption scheme, additional information has to be provided
on the following:
The information itself or a reference to the location of the data
Information or a reference to information via a uniform resource identifier about
the keys used in the encryption
The specification outlines a standard way to encrypt any form of digital content.
It permits encryption of a full XML message, a partial XML message, or an XML message
that contains sections that were previously encrypted. The procedure can be remembered
as the following steps:
Selecting all or part of an XML document to be encrypted
Applying canonicalization on the entire XML document
Using public-key encryption, encrypting the resulting XML document after
canonicalization
Sending the encrypted XML to the intended recipient
Let us see, with an example file, how to specify the encryption of a full XML message,
a partial XML message, or an XML message that contains sections that were previously
encrypted.
Nowadays, purchasing items through the internet is common. These online transactions
are supported by credit card payments, which need secure information exchange between
the parties. Here is an example of Mr. John X’s purchase of an item using a credit card.
The following XML document contains the credit card information for one of the purchases
made by Mr. John X.
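<?xml version='1.0'?>
<PaymentDetail xmlns='http://universalbank.org'>
<Name> John X Ramanoria </Name>
<CreditCard Limit='25000' Currency='INR'>
<Cre_Number> 2525 5252 2255 </Cre_Number>
<Cre_Issuer> Universal Bank </Cre_Issuer>
<Validity_Upto> 09/11 </Validity_Upto>
</CreditCard>
</PaymentDetail>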
The above document indicates that Mr. John X is using a credit card bearing the number
2525 5252 2255 with a limit of Rs 25,000 (INR). The card is issued by a bank called
Universal Bank and is valid up to 09/11.
As we discussed earlier, there are different ways of applying encryption to an XML
document, depending on which part of the document is to be kept confidential.
For example, if we intend not to disclose any information about the purchase, the
whole document is encrypted. If only the credit card information is to be
protected, only the CreditCard element is encrypted. If only the credit card
number is to be protected, only the Cre_Number element is encrypted. We will see the
XML equivalent of each scenario in the following examples.
If the complete document, beginning at the root tag, is to be encrypted,
all the elements are encrypted as a single encrypted string in the following way:
<?xml version='1.0'?>
<EncryptedData xmlns='http://www.w3.org/2001/04/xmlenc#'>
<CipherData>
<CipherValue> A1B2C3D4E5F6G7H8 </CipherValue>
</CipherData>
</EncryptedData>
Depending on the situation, if the name of the person is less sensitive than the
credit card information, the critical data can be kept confidential selectively.
In this example, the name of the person may be shown but not the other information.
This is achieved by encrypting only the CreditCard element, as shown below.
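<?xml version='1.0'?>
<PaymentDetail xmlns='http://universalbank.org'>
<Name> John X Ramanoria </Name>
<EncryptedData
Type='http://www.w3.org/2001/04/xmlenc#Element'
xmlns='http://www.w3.org/2001/04/xmlenc#'>
<CipherData>
<CipherValue> A1B2C3D4E5F6G7H8 </CipherValue>
</CipherData>
</EncryptedData>
</PaymentDetail>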
If only the content of an XML element is to be encrypted, and not the entire element,
that is also possible. In other words, if only the card’s number, issuer and validity
period are to be kept confidential, this can be done with the following code.
<?xml version='1.0'?>
<PaymentDetail xmlns='http://universalbank.org'>
<Name> John X Ramanoria </Name>
<CreditCard Limit='25000' Currency='INR'>
<EncryptedData
Type='http://www.w3.org/2001/04/xmlenc#Content'
xmlns='http://www.w3.org/2001/04/xmlenc#'>
<CipherData>
<CipherValue> A1B2C3D4E5F6G7H8 </CipherValue>
</CipherData>
</EncryptedData>
</CreditCard>
</PaymentDetail>
In this example, only the following elements have been encrypted: Cre_Number,
Cre_Issuer and Validity_Upto, as shown below:
<Cre_Number> 2525 5252 2255</Cre_Number>
<Cre_Issuer> Universal Bank</Cre_Issuer>
<Validity_Upto>09/11 </Validity_Upto>
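The idea of replacing only the content of one element with an EncryptedData element can be sketched in Java as follows. This is a hand-built illustration and not a conforming XML Encryption implementation: the class name PartialEncryptSketch and its helpers are hypothetical, a symmetric AES-GCM key is assumed instead of the public-key step listed earlier, and the EncryptionMethod and KeyInfo children that a real toolkit would emit are omitted.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class PartialEncryptSketch {
    static final String XMLENC_NS = "http://www.w3.org/2001/04/xmlenc#";

    public static SecretKey newKey() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Replaces the text content of the first tagName element with EncryptedData. */
    public static void encryptContent(Document doc, String tagName, SecretKey key)
            throws Exception {
        Element target = (Element) doc.getElementsByTagName(tagName).item(0);
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] encrypted = cipher.doFinal(
                target.getTextContent().getBytes(StandardCharsets.UTF_8));

        // Store IV followed by ciphertext, Base64-encoded, inside CipherValue.
        byte[] payload = new byte[iv.length + encrypted.length];
        System.arraycopy(iv, 0, payload, 0, iv.length);
        System.arraycopy(encrypted, 0, payload, iv.length, encrypted.length);

        Element encData = doc.createElementNS(XMLENC_NS, "EncryptedData");
        encData.setAttribute("Type", XMLENC_NS + "Content");
        Element cipherData = doc.createElementNS(XMLENC_NS, "CipherData");
        Element cipherValue = doc.createElementNS(XMLENC_NS, "CipherValue");
        cipherValue.setTextContent(Base64.getEncoder().encodeToString(payload));
        cipherData.appendChild(cipherValue);
        encData.appendChild(cipherData);

        while (target.hasChildNodes()) {
            target.removeChild(target.getFirstChild());
        }
        target.appendChild(encData);
    }

    /** Decrypts the first CipherValue found in the document. */
    public static String decryptContent(Document doc, SecretKey key) throws Exception {
        Node cipherValue = doc.getElementsByTagNameNS(XMLENC_NS, "CipherValue").item(0);
        byte[] payload = Base64.getDecoder().decode(cipherValue.getTextContent());
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, payload, 0, 12));
        return new String(cipher.doFinal(payload, 12, payload.length - 12),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newKey();
        Document doc = parse("<CreditCard><Cre_Number>2525 5252 2255</Cre_Number></CreditCard>");
        encryptContent(doc, "Cre_Number", key);
        System.out.println(decryptContent(doc, key));
    }
}
```

In practice a toolkit that implements the XML Encryption specification would be used instead of assembling the elements by hand.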
The XML Digital Signature specification provides the facilities to define the
required elements and the rules for processing them. These signatures provide integrity,
message authentication and signer authentication services for XML data.
The XML Digital Signature specification defines a set of XML elements for describing
the details of a signature. Some of these elements are listed here:
SignedInfo
CanonicalizationMethod
SignatureMethod
Reference
KeyInfo
Transforms
DigestMethod
DigestValue
Let us see the steps to be done to digitally sign an XML document using the XML
signature elements:
Create a SignedInfo element with SignatureMethod, CanonicalizationMethod
and Reference
Canonicalize the XML document
Calculate the SignatureValue based on algorithms specified in SignedInfo
Construct the Signature element that includes SignedInfo, KeyInfo and
SignatureValue
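The steps above can be sketched with the XML Digital Signature API that ships with the JDK (javax.xml.crypto.dsig). The sketch is illustrative only: the class name EnvelopedSignatureSketch and its helper methods are hypothetical, the key pair is generated on the spot, and SHA-256 based algorithms are assumed here in place of the SHA-1 identifiers that appear in the listings of this unit.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Collections;
import javax.xml.crypto.dsig.CanonicalizationMethod;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.SignedInfo;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignature;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.dom.DOMSignContext;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import javax.xml.crypto.dsig.keyinfo.KeyInfo;
import javax.xml.crypto.dsig.keyinfo.KeyInfoFactory;
import javax.xml.crypto.dsig.spec.C14NMethodParameterSpec;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class EnvelopedSignatureSketch {
    static final XMLSignatureFactory FACTORY = XMLSignatureFactory.getInstance("DOM");

    public static KeyPair newKeys() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);  // required for signature processing
        return dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Appends an enveloped Signature element to the document's root element. */
    public static void sign(Document doc, KeyPair keys) throws Exception {
        // Reference: URI "" means the whole document; the enveloped transform
        // excludes the Signature element itself from the digest.
        Reference reference = FACTORY.newReference("",
                FACTORY.newDigestMethod(DigestMethod.SHA256, null),
                Collections.singletonList(FACTORY.newTransform(
                        Transform.ENVELOPED, (TransformParameterSpec) null)),
                null, null);
        // SignedInfo: CanonicalizationMethod, SignatureMethod and Reference.
        SignedInfo signedInfo = FACTORY.newSignedInfo(
                FACTORY.newCanonicalizationMethod(
                        CanonicalizationMethod.INCLUSIVE, (C14NMethodParameterSpec) null),
                FACTORY.newSignatureMethod(
                        "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256", null),
                Collections.singletonList(reference));
        // KeyInfo: carries the public key as a KeyValue element.
        KeyInfoFactory keyInfoFactory = FACTORY.getKeyInfoFactory();
        KeyInfo keyInfo = keyInfoFactory.newKeyInfo(
                Collections.singletonList(keyInfoFactory.newKeyValue(keys.getPublic())));
        // Signing computes DigestValue and SignatureValue and builds Signature.
        DOMSignContext context = new DOMSignContext(keys.getPrivate(), doc.getDocumentElement());
        FACTORY.newXMLSignature(signedInfo, keyInfo).sign(context);
    }

    /** Validates the first Signature element against the given public key. */
    public static boolean verify(Document doc, KeyPair keys) throws Exception {
        Node signatureNode = doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature").item(0);
        DOMValidateContext context = new DOMValidateContext(keys.getPublic(), signatureNode);
        return FACTORY.unmarshalXMLSignature(context).validate(context);
    }

    public static void main(String[] args) throws Exception {
        KeyPair keys = newKeys();
        Document doc = parse("<PurchaseOrder><price>6000</price></PurchaseOrder>");
        sign(doc, keys);
        System.out.println(verify(doc, keys));
    }
}
```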
The elements are explained below with example XML segments. Consider the following
XML segment, which describes a purchase order for an item to be delivered to a
particular address.
<PurchaseOrder xmlns="url: xxx.purchase">
<DeliveredTo countryname="INDIA">
<cus_name>Veda </cus_name>
<street>12 Chittankudi</street>
<city>Puducherry</city>
<state>Pondicherry</state>
<pincode>605004</pincode>
</DeliveredTo>
<items>
<item_names partNum="52525252">
<productName>KinderJoy candy</productName>
<quantity>200</quantity>
<price>6000</price>
</item_names>
</items>
</PurchaseOrder>
Now look into the following XML segment with the signature information.
<PurchaseOrder xmlns="url: xxx.purchase">
<DeliveredTo countryname="INDIA">
<cus_name>Veda </cus_name>
<street>12 Chittankudi</street>
<city>Puducherry</city>
<state>Pondicherry</state>
<pincode>605004</pincode>
</DeliveredTo>
<items>
<item_names partNum="52525252">
<productName>KinderJoy candy</productName>
<quantity>200</quantity>
<price>6000</price>
</item_names>
</items>
<Signature Id="EnvelopedSig"
xmlns="http://www.w3.org/2000/09/xmldsig#">
<SignedInfo Id="EnvelopedSig.SigInfo">
<CanonicalizationMethod Algorithm=
"http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
<SignatureMethod Algorithm=
"http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
<Reference Id="EnvelopedSig.Ref" URI="">
<Transforms>
<Transform Algorithm=
"http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
</Transforms>
<DigestMethod Algorithm=
"http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>
yHIsORnxE3nAObbjMKVo1qEbToQ=
</DigestValue>
</Reference>
</SignedInfo>
<SignatureValue Id="EnvelopedSig.SigValue">
GqWAmNzBCXrogn0BlC2VJYA8CS7gu9xH/XVWFa08e
</SignatureValue>
<KeyInfo Id="EnvelopedSig.KeyInfo">
<KeyValue>
<RSAKeyValue>
<Modulus>
AIvPYJVd5zFrRRrJzB/awFLXb73kSlWqHao+3nxuF38r
ZPRTkGIKjD7rw4 Vvml7nKlqWg/NhCLWCQFWZ
</Modulus>
<Exponent>AQAB</Exponent>
</RSAKeyValue>
</KeyValue>
</KeyInfo>
</Signature>
</PurchaseOrder>
Is it complex? Is the XML segment very large? Even though it looks large and complex,
it actually is not. Let us explore each element in detail.
Digest
Here we have two elements, <DigestMethod> and <DigestValue>, used in the following
way.
<DigestMethod Algorithm=
"http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>
yHIsORnxE3nAObbjMKVo1qEbToQ=
</DigestValue>
Let us define digest first. A digest is the result of applying a mathematical algorithm
(a secure hash) to a portion of the message, which ensures that the data being signed
cannot be tampered with undetected. As soon as the digest is created, the next step is to
add all the additional signed information and then create a digest of that as well. Is
the job over? Not yet: that digest is encrypted and written into the XML message itself
as the digital signature. In the above example, the selected algorithm and the initial
digest are contained in the <DigestMethod> and <DigestValue> elements.
Generating the message digest is the next issue. This is done with the help of the
<Reference> element, which includes the information about any data transformation or
normalization applied along the way, including canonicalization. For example, a digital
signature can be associated with an XML document in different ways, as specified below:
Enveloped: The signature is a child of the element being signed.
Enveloping: The element being signed is contained within the signature.
Detached: The signature is a sibling of the element being signed and is referenced by
a local link, or it can be located elsewhere on the network.
This information is specified using the <Transform> tag, which is available inside the
signature. In the example above, we chose the enveloped method:
<Transforms>
<Transform Algorithm=
"http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
</Transforms>
Other examples of transforms are base64 encoding, XPath filtering, XSLT
transformation, and schema validation.
As we pointed out earlier, the selected algorithm and the digest are specified with
these tags:
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue> yHIsORnxE3nAObbjMKVo1qEbToQ= </DigestValue>
Now it is important to take a close look at the <Signature> element. The <Signature>
element contains the <SignedInfo> element, which specifies the data that is actually
signed and the algorithms used to sign it. <SignedInfo> has three elements:
<CanonicalizationMethod>, <SignatureMethod>, and <Reference>.
The next step after creating the digest is tracking and specifying the actual
method used to create the signature (denoted by the <SignatureMethod> element). After
the canonical version of the XML is derived, the data that is part of the <SignedInfo>
element needs to be converted into the actual signature value (and placed in the
<SignatureValue> element). The <SignatureMethod> element specifies the algorithm that
will be used for this operation.
The algorithm used to create the signature and, finally, the signature itself are
specified in the <SignatureMethod> and <SignatureValue> tags:
<SignatureMethod Algorithm=
"http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
<SignatureValue Id="EnvelopedSig.SigValue">
e76Tduvq/N8kVd0SkYf2QZAC+j1IqUPFQe8CNA0CfUrHZdiS4TDDVv4sf0V1c6UBj7
zT 7leCQxAdgpOg/2Cxc=
</SignatureValue>
In this example segment, when the receiver gets the message, the signature is decrypted
using the sender’s public key and the digest is verified, thereby verifying the sender’s
signature. Who has to provide the key information? In the following listing, the
<KeyInfo> element holds the decryption key:
<KeyInfo Id="EnvelopedSig.KeyInfo">
<KeyValue>
<RSAKeyValue>
<Modulus>
mJVd5zFrRRrJzB/awFLXb73kSlWqHao+3nxuF38r
Rk0HmqgsoKgWVvml7nKlqWg/NhCLWCQFWZ
</Modulus>
<Exponent>AQAB</Exponent>
</RSAKeyValue>
</KeyValue>
</KeyInfo>
Note that XML Signature doesn’t address the trustworthiness of such key information.
Whose responsibility is it, then? Generally, the application has to determine how
trustworthy the key is. Unless there is a way to verify that the supplied decryption key
really belongs to the sender, there is little point to the process: anyone could
intercept the message, change its contents, generate a new public/private key pair, and
re-sign the document, making the new public key appear to belong to the sender. This is
where digital certificates come into the picture.
The certificate contains the binding between the identity of the public key’s owner
and the key itself. If the <KeyInfo> element is omitted, the recipient is expected to
identify the key to be used from the application context. This type of issue is
addressed in the XKMS specification, which is discussed later. Using XKMS or
another public key infrastructure, the recipient of the message can obtain the digital
certificate, extract the public key from it, and verify that this key does belong to the
sender.
5.5 XKMS
Keeping public and private keys, digital signatures, and digital certificates organized
and secure is one of the biggest challenges in deploying these new encryption, digital
signature, and authentication technologies. Hence a methodology for managing these
security components is needed. To this end, the XML Key Management Specification
(XKMS) is an emerging effort under the backing of the W3C. The goal of XKMS is to
provide standardized XML-based transaction definitions for the management of
authentication, encryption, and digital signature services. The previous sections
discussed the XML Encryption and XML Digital Signature specifications. However, these
specifications assume that the web service responsible for processing the XML exists in
an environment where keys and certificates are kept safe and secure, and that the web
service programmer knows which certificates and keys to use. XKMS provides a set of XML
definitions that allow developers to contact a third party, which helps in locating and
providing the appropriate keys and certificates. The advantage of letting a third party
do this confidential job is that it frees the web service programmer from having to
track the availability of keys or certificates and ensure their validity.
In other words, XKMS provides a standardized set of XML definitions that allow
developers to contact and use remote trusted third-party services.
The trusted third-party services will provide the following services:
encryption and decryption services
creation of keys
management of keys
authentication of keys and digital signatures
The specification defines a set of tags used to query external key management and
signature validation services. For example, to check the authenticity of a certificate,
a client might ask a remote service questions such as “Is this a valid certificate?” or
make requests such as “Provide the value of the key managed by you.” Thus XKMS provides
the facility to manage keys.
XKMS was submitted to the W3C by Microsoft, VeriSign and webMethods and is
backed by a range of companies such as HP, IBM, Lenovo, etc. XKMS is one of the
three W3C specifications that define the XML security architecture.
On the whole, XKMS specifies the protocols for distributing and registering public
keys. It is suitable for use in conjunction with the planned standard for XML Signature
and as an additional standard for XML Encryption. The structure of XKMS contains two
sections:
XML Key Information Service Specification (X-KISS)
XML Key Registration Service Specification (X-KRSS)
Whenever a person signs a document, it is not necessary to specify any key
information other than the value of the <KeyInfo> element. The value may include the key
name, certificate name, key identifier and so on. Alternatively, a link may be provided
to a location that contains the required KeyInfo details.
Public key information is registered through the protocol that X-KRSS
specifies. Once a key is registered, it can be used along with other web services. The
same protocol may also be used for the recovery of private keys. The protocol provides
for authentication of the applicant. The key pair (public key and private key) may be
generated by the applicant, in which case the protocol provides for proof of possession
of the private key. If the private key is generated by the registration service, a means
of communicating the private key to the client is provided.
The following sections explain key retrieval, the location service and the validate
service with some example XML documents:
Key retrieval
If the client wants to retrieve the decryption key from a remote source, XKMS provides
a simple method: the <RetrievalMethod> tag inside the <KeyInfo> element of the XML
signature. The following segment assumes that a service exists that can provide
information about a given key.
<KeyInfo>
<RetrievalMethod
URI="http://www.KeyFil.samp/ValidateKey"
Type="http://www.w3.org/2000/09/xmldsig#X509Certificate"/>
</KeyInfo>
This search for a key is very simple and does not require the service to enforce the
validity of the key it returns.
Location service
If the application client wants to query a service for public key information, the
location service provides a set of tags for this purpose. For example, if a web service
client wants to encrypt something with the recipient’s public key, it must know the key
value, and it contacts the key location service to obtain that key. The following
listing shows the <Locate>, <Query>, and <Respond> tags used in the request:
<Locate>
<Query>
<KeyInfo>
<KeyName>Varanam AAyeeram</KeyName>
</KeyInfo>
</Query>
<Respond>
<string>KeyName</string>
<string>KeyValue</string>
</Respond>
</Locate>
In this example XML segment, the <Query> tag provides the name of the requested
key, and the <Respond> element lists the items that the client would like to know about.
The response looks like this:
<LocateResult>
<Result>Success</Result>
<Answer>
<KeyInfo>
<KeyName> Varanam AAyeeram </KeyName>
<KeyValue>the actual key value</KeyValue>
</KeyInfo>
</Answer>
</LocateResult>
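A client that receives such a response has to check the <Result> value and then read the requested items. A minimal Java sketch of that step, using the standard DOM parser, is shown below. The class name LocateResultReader and its methods are hypothetical, and the response is assumed to be available as a string with the element names used in the listing above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class LocateResultReader {
    /** Returns the trimmed text of the first element with the given tag, or null. */
    public static String firstText(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Returns the KeyValue text if the service reported Success, otherwise null. */
    public static String keyValueIfSuccess(String responseXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(responseXml.getBytes(StandardCharsets.UTF_8)));
        if (!"Success".equals(firstText(doc, "Result"))) {
            return null;
        }
        return firstText(doc, "KeyValue");
    }

    public static void main(String[] args) throws Exception {
        String response =
                "<LocateResult><Result>Success</Result><Answer><KeyInfo>"
              + "<KeyName> Varanam AAyeeram </KeyName>"
              + "<KeyValue>the actual key value</KeyValue>"
              + "</KeyInfo></Answer></LocateResult>";
        System.out.println(keyValueIfSuccess(response));
    }
}
```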
Validate Service
The binding between a key and an attribute should be validated. The Validate Service,
available through a trusted third party, can be used to do this: the third party
validates the binding between a key and an attribute. For instance, look at the
following query:
<Validate>
<Query>
<Status>Valid</Status>
<KeyInfo>
<KeyName>...</KeyName>
<KeyValue>...</KeyValue>
</KeyInfo>
</Query>
<Respond>
<string>KeyName</string>
<string>KeyValue</string>
</Respond>
</Validate>
If this query is sent to the Validate Service, the following result is
produced.
<ValidateResult>
<Result>Success</Result>
<Answer>
<KeyBinding>
<Status>Valid</Status>
<KeyID>http://www.xmltcenr.org/assert/20-39 </KeyID>
<KeyInfo>
<KeyName>...</KeyName>
<KeyValue>...</KeyValue>
</KeyInfo>
<ValidityInterval>
<NotBefore>2000-09-20T12:00:00</NotBefore>
<NotAfter>2000-10-20T12:00:00</NotAfter>
</ValidityInterval>
</KeyBinding>
</Answer>
</ValidateResult>
The XML segment shows the result generated for the given query and sent to the
application client. The value Success for the <Result> element indicates that the
request was processed successfully by the service. Similarly, the <Status> element
indicates the outcome of the processing; the value Valid in this case means that the
key binding is valid.
The <ValidityInterval> element is optional. It indicates the time span
for which the Validate Service’s results are considered valid. You may ask: once digital
certificates or keys are generated, are they valid indefinitely? No, they are not
unconditionally valid; they can be (and frequently are) assigned a
specific time limit, after which they expire and are no longer valid.
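A client can enforce this time limit by comparing the current time with the NotBefore and NotAfter values. The following Java sketch assumes the timestamps are local date-times without a time zone, as in the listing above; the class name ValidityIntervalCheck and its method are hypothetical.

```java
import java.time.LocalDateTime;

public class ValidityIntervalCheck {
    /** Returns true if 'when' lies inside the interval [notBefore, notAfter]. */
    public static boolean isWithin(String notBefore, String notAfter, LocalDateTime when) {
        LocalDateTime start = LocalDateTime.parse(notBefore);
        LocalDateTime end = LocalDateTime.parse(notAfter);
        return !when.isBefore(start) && !when.isAfter(end);
    }

    public static void main(String[] args) {
        String notBefore = "2000-09-20T12:00:00";
        String notAfter = "2000-10-20T12:00:00";

        // Inside the interval: the validation result may still be relied upon.
        System.out.println(isWithin(notBefore, notAfter, LocalDateTime.of(2000, 10, 1, 0, 0)));
        // After NotAfter: the result has expired and must be requested again.
        System.out.println(isWithin(notBefore, notAfter, LocalDateTime.of(2000, 11, 1, 0, 0)));
    }
}
```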
In addition, XKMS also defines requests and responses for the following areas:
Key registration
Key revocation
How do you send a request to the third-party KMS to tell it that you no longer want it
to manage the key on your behalf?
Key recovery
What if you forget your private key? XKMS offers a solution: it describes how to send
a request to obtain the private key and what the response looks like. The specification
does not state the rules under which the private key should be returned. For example, it
may be the policy of the service to cancel the old key and issue a new one after a
certain period. That decision is up to the policy of the individual provider.
VeriSign is one of the primary drivers of XKMS. It has already released a Java
toolkit that supports XKMS development. To download the product, visit
http://www.xmltrustcenter.org/xkms/download.htm.
Java Toolkits
IBM XML Security Suite and the Phaos XML Toolkit are some of the Java toolkits
available for XML security. The toolkits use Xerces and Xalan to parse the XML data.
Signatures are assembled by using their own APIs, and the same APIs are used for
encrypting the data. The Phaos sample simply uses parser APIs such as
doc.getElementsByTagName(tagName) to access the element to be encrypted, as shown
in the following listing:
// Copyright © Phaos Technologies
public class XEncryptTest
{
    public static void main (String[] args) throws Exception
    {
        ... // usage, command line args...
        // get the XML file and retrieve the XML Element to be encrypted
        File xmlFile = new File(inputFileName);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(xmlFile);
        Element inputElement = null;
        NodeList list = doc.getElementsByTagName(tagName);
        if (list.getLength() != 0)
            inputElement = (Element) list.item(0);
        else
        {
            System.err.println("XML element with tagName " + tagName + " unidentified.");
            System.exit(1);
        }
        // Create a new XEEncryptedData instance with the owner
        // Document of the input xml file, the data type URI and
        // the Id "ED" for this EncryptedData element.
        XEEncryptedData encData
            = XEEncryptedData.newInstance(doc, "ED", dataType);
        ... // determine encryption algorithm
        // set up the EncryptionMethod child element
        XEEncryptionMethod encMethod = encData.createEncryptionMethod(algURI);
        encData.setEncryptionMethod(encMethod);
        // set up the symmetric key to be used in encryption
        SymmetricKey key = null;
        File keyFile = new File(keyFileName);
        ... // File stuff
        // set up the ds:KeyInfo child element with the keyName
        XSKeyInfo keyInfo = encData.createKeyInfo( );
        keyInfo.addKeyInfoData(encData.createKeyName(keyName));
        encData.setKeyInfo(keyInfo);
        // set a nonce value to be prepended to the plain text
        byte[] nonce = new byte[16];
        encData.setNonce(RandomBitsSource.getDefault().randomBytes(nonce));
        // encrypt the XML element and replace it with the
        // newly generated EncryptedData element
        System.out.print("Encrypting the XML data ... ");
        XEEncryptedData newEncData =
The Phaos toolkit was much easier to set up and run than the IBM toolkit. This piece
of code makes a call to encryptAndReplace( ). This method takes the element that we have
given it, encrypts it by using the given key, and replaces the original element with the
appropriately tagged, encrypted element.
On the whole, web services security is still an emerging area, and researchers and
vendors have to work together to handle it properly.
Single-sign-on
What is single-sign-on? It is the ability of an end user or application
to access other applications within a secure environment without needing
to be validated by each application. The most common example of single-sign-on technology
is in web-based corporate intranet applications.
What is the use of this environment? In this setting, users may want to use various
applications that give access to their timetable, project schedule, expense reports and
health benefits. If each user had to be authenticated by every application individually,
the intranet site would be inconvenient and slow, and its value would be limited. Single
sign-on is one solution: it allows access to all applications without
additional intervention after the initial sign-on, using a profile that defines what the
user is allowed to do.
The single-sign-on concept is easily extended to web services. Web services can be
given a permit (placed in an XML/SOAP message) that can be used to validate the service
with other web services. However, the secure use of web services will depend on the
ability to exchange user credentials on a scale never seen before. Individual services will
reside in a variety of protected environments, each using various security products and
technologies. Providing a way to integrate these environments and enable their interoperability
is critical for the secure and effective use of these services.
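The idea of a permit that is issued once and then checked by several services can be sketched as follows. This is only an illustration under stated assumptions: the Permit, Claims and Mac element names, the class SsoPermitSketch, and the use of a shared HMAC key are all hypothetical, and the string handling is deliberately naive. Real deployments would normally rely on a standard token format and on public-key signatures rather than a key shared by every service.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: a sign-on service issues a permit, other services verify it.
public class SsoPermitSketch {

    // Compute a Base64-encoded HMAC-SHA256 over the claims with the shared key.
    static String hmac(String data, byte[] key) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return Base64.getEncoder().encodeToString(mac.doFinal(data.getBytes(StandardCharsets.UTF_8)));
    }

    // Issued once, at the initial sign-on; the fragment could travel in a SOAP header.
    static String issuePermit(String user, String roles, byte[] key) throws Exception {
        String claims = user + "|" + roles;
        return "<Permit><Claims>" + claims + "</Claims><Mac>" + hmac(claims, key) + "</Mac></Permit>";
    }

    // Each service recomputes the MAC instead of asking the user to sign on again.
    static boolean verifyPermit(String permit, byte[] key) throws Exception {
        String claims = between(permit, "<Claims>", "</Claims>");
        String mac = between(permit, "<Mac>", "</Mac>");
        return MessageDigest.isEqual(
                hmac(claims, key).getBytes(StandardCharsets.UTF_8),
                mac.getBytes(StandardCharsets.UTF_8));
    }

    // Naive extraction of the text between two markers; assumes both are present.
    static String between(String text, String open, String close) {
        int start = text.indexOf(open) + open.length();
        return text.substring(start, text.indexOf(close, start));
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "demo-shared-secret".getBytes(StandardCharsets.UTF_8);
        String permit = issuePermit("alice", "timetable,expenses", key);
        System.out.println(verifyPermit(permit, key));                               // prints true
        System.out.println(verifyPermit(permit.replace("alice", "mallory"), key));   // prints false
    }
}
```

Any service holding the key can accept the permit without a second login, and a permit whose claims were altered in transit is rejected.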
Signing XML documents needs care, since any change to the document, such as the
introduction of white space or a change of case, changes the signature. Keep the following
two points in mind when signing a document:
Content Presentation techniques may introduce changes
Transformation may alter the content
If the original and the transformed document are not handled with due care, signature
verification will return a different result than intended. As in any security infrastructure,
the security of an overall system will depend on the security and integrity of procedures
and personnel as well as procedural enforcement.
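The sensitivity of a signature to white space and case can be seen at the digest level, since a signature is computed over a digest of the bytes. The following minimal sketch uses only the JDK; the class name DigestSensitivitySketch and the sample order document are assumptions made for illustration, and the code compares raw digests rather than complete XML Signatures.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

// Hypothetical sketch: small textual changes to XML produce different digests.
public class DigestSensitivitySketch {

    // Return the Base64-encoded SHA-256 digest of the UTF-8 bytes of the text.
    static String sha256(String text) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        return Base64.getEncoder().encodeToString(md.digest(text.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        String original = "<order><item>book</item></order>";
        String reindented = "<order>\n  <item>book</item>\n</order>";   // white space introduced
        String recased = "<Order><item>book</item></Order>";            // case of a tag changed

        System.out.println(sha256(original).equals(sha256(reindented))); // prints false
        System.out.println(sha256(original).equals(sha256(recased)));    // prints false
    }
}
```

This is why both parties must apply the same canonicalization to a document before its digest is computed and compared.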
Summary
Security is one of the important aspects of web commerce. While it is possible to use
standard security protocols to encrypt and authenticate XML, the structure and definition
of XML, and its use in SOAP, raise issues that require specialized security solutions.
W3C has developed XML Encryption and XML Signature to provide for the selective
signing and encryption of XML elements and content. We have also seen how XKMS
handles the issues of trust; it builds on the services of XML Signature and XML
Encryption and relies on established certificate authorities.
QUESTIONS
Part A
1. Define non-repudiation.
2. What do you mean by Data integrity?
3. What do you mean by confidentiality?
4. What are the two basic approaches to cryptography?
5. What kind of cryptography is used in fixed devices such as ATM machines?
6. Define digital hashing.
7. Who are Certificate Authorities (CAs)?
8. Define canonicalization.
9. Define digest.
10. Explain key management.
11. ________ is one of the three W3C specifications that define the XML security
architecture.
12. XKMS specifies the protocols for distributing and registering ________.
13. The structure of XKMS contains ________ and ________.
14. What do you mean by single sign-on?
15. What are the two key points for signing XML documents?
Part B
1. What are the basic security needs of an e-business application?
2. Discuss single-key cryptography.
3. Discuss public-key cryptography.
4. How does public-key cryptography address the three dimensions of secured
transactions of an e-business application?
5. What do you mean by digital signature? Explain.
6. Discuss certificates and private key management.
7. Discuss canonicalization.
8. What are the guidelines and high-level rules to convert the document to an xml-c14n-
compliant canonical format?
9. Does XML Encryption support the encryption of an entire XML document or
part of it? Explain.
10. Explain with a suitable example the process of encrypting XML data.
11. What are the elements of XML digital signature specification?
12. What are the steps in generating XML digital signatures?
Part C
1. What are the security issues of an open, loosely coupled environment?
2. Discuss in detail the two approaches to cryptography.
3. Discuss in detail the XML security framework and its components.
4. Explain with a suitable example the process of encrypting XML data.
5. Explain in detail how XML digital signatures are generated and used.
6. Explain how XKMS defines requests and responses for key management.
ANSWERS
1 - a, 2 - d, 3 - c, 4 - a, 5 - a, 6 - a, 7 - c, 8 - d, 9 - c, 10 - d, 11 - c, 12 - a, 13 - d,
14 - c, 15 - d