E-Com Notes (Chapters 6-9)

Uploaded by

rajanityagi23

/1/2010

E-business management
By
Rajani tyagi
Contents
E-Business Introduction
E-Business Strategies
Integration of Applications
E-Commerce Infrastructure
E-Business Introduction
E-Business vs. E-commerce
While some use e-commerce and e-business interchangeably, they are distinct concepts.
Electronic business, commonly referred to as "eBusiness" or "e-business", may be defined as the application of information and communication technologies (ICT) in support of all the activities of business. Commerce constitutes the exchange of products and services between businesses, groups and individuals and can be seen as one of the essential activities of any business. Electronic commerce focuses on the use of ICT to enable the external activities and relationships of the business with individuals, groups and other businesses.
E-commerce is a particular form of e-business. Electronic business methods enable companies to link their internal and external data processing systems more efficiently and flexibly, to work more closely with suppliers and partners, and to better satisfy the needs and expectations of their customers. Compared to e-commerce, e-business is a more generic term because it refers not only to information exchanges related to buying and selling but also to servicing customers and collaborating with business partners, distributors and suppliers.
E-business encompasses sophisticated business-to-business interactions and collaboration activities at the level of enterprise applications and business processes, enabling business partners to share in-depth business intelligence, which leads, in turn, to the management and optimization of inter-enterprise processes such as supply chain management. More specifically, e-business enables companies to link their internal and external processes more efficiently and flexibly, work more closely with suppliers and better satisfy the needs and expectations of their customers.
In practice, e-business is more than just e-commerce. While e-business refers to a more strategic focus with an emphasis on the functions that occur when using electronic capabilities, e-commerce is a subset of an overall e-business strategy. E-commerce seeks to add revenue streams using the World Wide Web or the Internet to build and enhance relationships with clients and partners and to improve efficiency using the Empty Vessel strategy. Often, e-commerce involves the application of knowledge management systems.
E-business involves business processes spanning the entire value chain: electronic purchasing and supply chain management, processing orders electronically, handling customer service, and cooperating with business partners. Special technical standards for e-business facilitate the exchange of data between companies. E-business software solutions allow the integration of intra-firm and inter-firm business processes. E-business can be conducted using the Web, the Internet, intranets, extranets, or some combination of these.
Basically, electronic commerce (EC) is the process of buying, transferring, or exchanging products, services, and/or information via computer networks, including the Internet. EC can also be viewed from many perspectives, including business process, service, learning, collaboration and community. EC is often confused with e-business.
In e-commerce, information and communications technology (ICT) is used in inter-business or inter-organizational transactions (transactions between and among firms/organizations) and in business-to-consumer transactions (transactions between firms/organizations and individuals).
In e-business, on the other hand, ICT is used to enhance one's business. It includes any process that a business organization (whether a for-profit, governmental or nonprofit entity) conducts over a computer-mediated network.
A more comprehensive definition of e-business is: "The transformation of an organization's processes to deliver additional customer value through the application of technologies, philosophies and computing paradigm of the new economy."
Three primary processes are enhanced in e-business:
• Production processes, which include procurement, ordering and replenishment of stocks; processing of payments; electronic links with suppliers; and production control processes, among others;
• Customer-focused processes, which include promotional and marketing efforts, selling over the Internet, processing of customers' purchase orders and payments, and customer support, among others;
• Internal management processes, which include employee services, training, internal information-sharing, videoconferencing, and recruiting. Electronic applications enhance information flow between production and sales forces to improve sales force productivity. Workgroup communications and electronic publishing of internal business information are likewise made more efficient.
E-business goes far beyond e-commerce or buying and selling over the Internet, and deep into the processes and cultures of an enterprise. It is the powerful business environment that is created when you connect critical business systems directly to customers, employees, vendors, and business partners, using intranets, extranets, e-commerce technologies, collaborative applications, and the Web.
E-business is a more strategic focus with an emphasis on the functions that occur when using electronic capabilities, while e-commerce is a subset of an overall e-business strategy. E-commerce seeks to add revenue streams using the World Wide Web or the Internet to build and enhance relationships with clients and partners and to improve efficiency, while electronic business methods enable companies to link their internal and external data processing systems more efficiently and flexibly, to work more closely with suppliers and partners, and to better satisfy the needs and expectations of their customers.
E-business is at the enterprise application level and encompasses sophisticated B2B interaction and collaboration activities. Enterprise application systems such as ERP, CRM and SCM form an integral part of e-business strategy and focus.
Critical Factors with Respect to e-Business
E-business supports business processes along the entire value chain: electronic purchasing (e-procurement), supply chain management (SCM), processing orders electronically, customer service and cooperation with business partners.
One of the objectives of e-business is to provide seamless connectivity and integration between business processes and applications external to an enterprise and the enterprise's back-office applications such as billing, order processing, accounting, inventory and receivables, and services focused on total supply chain management and partnership, including product development, fulfillment, and distribution. In this respect, e-business is much more than e-commerce.
To succeed in e-business it is crucial to combine technological developments with a corporate strategy that redefines a company's role in the digital economy while taking into account its various stakeholders. It is imperative to understand the issues, evaluate the options, and develop technology orientation plans. An e-business strategy helps organizations identify their e-business concerns, assess their information needs, analyze to what degree existing systems serve these objectives, pinpoint specific improvements, determine the development stages of e-business solutions and attain concrete and measurable results. Thus, it is clear that e-business solutions are not only about technology.
A classic example is SAP systems integration for an organization. This is itself taken up as a project and executed with great attention to detail. A minute logical error in the interpretation of the firm's objectives could result in the entire system being reworked from scratch.
E-business allows for the redefinition of value, competitiveness and the very nature of transactions, and it affects all areas of an organization. It is crucial to combine technology and business strategy while taking into account various stakeholders.
An e-business strategy helps to:
• Identify e-business concerns
• Assess information needs
• Analyze existing systems
• Identify improvements required in existing systems
• Determine the stages of development of solutions
• Attain concrete and measurable results.
Characteristics of e-Business
To emphasize, e-business is not simply buying and selling but encompasses the exchange of many kinds of information, including online commercial transactions. E-business is about integrating external company processes with an organization's internal business processes; as such, a variety of core business processes could exploit an e-business infrastructure.
These include, among others:
• Collaborative Product Development
• Collaborative Planning, Forecasting and Replenishment
• Procurement and Order Management
• Operations and Logistics
Collaborative Product Development
This is one of the fastest-growing technologies in engineering, with some form of solution being implemented in a range of industries such as automotive, aerospace and agricultural machinery. It contributes towards making products in a short time span while maintaining quality and reducing cost.
It also aids in maximizing time-to-market benefits while maintaining control over product development information. By integrating the design and testing cycles of its products with those of its suppliers, a firm can shorten the complete cycle of its products. This clearly reduces the total cost of the product cycle and, even more importantly, reduces the time needed to bring products to the marketplace. Collaborative product development solutions offer ERP integration and SCM.
Collaborative Planning, Forecasting and Replenishment
This is a process in which manufacturers, distributors and retailers work together to plan, forecast and replenish products. In e-business relationships, collaboration takes the form of sharing information that impacts inventory levels and merchandise flow.
Collaboration points: sales forecasts, inventory requirements, manufacturing and logistic lead times, seasonal set schedules, new/remodel storage plans, promotional plans, etc.
Goal: To get the partners to work together to lower supply cycle times, improve customer service, lower inventory costs, improve inventory levels and achieve better control of planning activities.
Procurement and Order Management
Electronic procurement, or e-procurement, can achieve significant savings and other benefits that impact the customer. To support procurement and order management processes, companies use an integrated electronic ordering process and other online resources to increase efficiency in purchasing operations.
Benefits: cost savings, better customer service by controlling the supply base, negotiating effective buying preferences, and streamlining the overall procurement process.
Operations & Logistics
Logistics is that part of the supply chain process that plans, implements and controls the efficient, effective flow and storage of goods, services and related information from the point of origin to the point of consumption in order to meet customer requirements. To make this happen, transportation, distribution, warehousing, purchasing and order management functions must work together. Logistics in the e-business era is all about collaboration: the sharing of critical and timely data on the movement of goods as they flow from raw material all the way to the end user.
Operations and logistics processes are based on open communication between networks of trading partners, where integrated processes and technology are essential for high-performance logistics operations. These solutions help manage the logistics process between buyers and suppliers, while eliminating costly discrepancies between purchase order, sales order and shipping information. By eradicating these variances and inconsistencies, improvements in the supply chain may result from the elimination of mixed shipments and shipment discrepancies, and the reduction of inventory carrying costs for the customer. At the same time, this increases customer satisfaction through improved delivery reliability and improved efficiencies in receiving operations.
Furthermore, there are critical elements to e-business models as well. They are as follows:
• A shared digital business infrastructure, including digital production and distribution technologies (broadband/wireless networks, content creation technologies and information management systems), which will allow business participants to create and utilize network economies of scale and scope.
• A sophisticated model for operations, including integrated value chains, both supply chains and buy chains.
• An e-business management model, consisting of business teams and/or partnerships.
• Policy, regulatory and social systems, i.e., business policies consistent with e-commerce laws, teleworking/virtual work, distance learning, incentive schemes, among others.
• Ease of automated processing: A payer can now cheaply and easily automate the generation and processing of multiple payments with minimal effort. Previously, the dependency upon banks to handle most payments and the lack of a cheap, ubiquitous communications technology made automation of payment processes expensive and difficult to establish.
• Immediacy of result: Payment immediacy occurs because of automation and the ability of the intermediate systems and providers to process payments in real time. With the more manual, paper-based systems there was always a time delay due to the requirement for human intervention in the process.
• Openness and accessibility: The availability of cheap computing and communications technology and the appropriate software enables small enterprises and individuals to access or provide a range of payment services that were previously only available to large organizations via dedicated networks or the transactional processing units of banks.
• Loss of collateral information: The new technology dispenses with, or alters, collateral information accompanying transactions. This information has traditionally been part of the transaction, and has been relied upon by the transacting parties to validate individual payments.
• Collateral information can be defined as information:
o Which is not essential to the meaning and intent of a transaction;
o Which is typically incidental to the nature of the communications channel over which the transaction is conducted, but nevertheless provides useful contextual information for one or more of the parties to the transaction.
• Collateral information can include many things, ranging from tone of voice in a telephone call to the business cards, letterheads and apparent authority of the person with whom you are dealing.
• Globalization: Globalization, or the minimization of geographical factors in making payments, has been an obvious aspect of the new payment systems. It affects areas such as the size of the payments marketplace, uncertainty as to legal jurisdiction in the event of disputes, the location and availability of transaction trails, and the ability of a payment scheme to rapidly adapt to regulatory regimes imposed by one country by moving to another.
• New business models: New business models are being developed to exploit the new payment technologies, in particular to address or take advantage of the disintermediation of customers from traditional payment providers such as banks. In this context, disintermediation is where the technology enables a third party to intervene between the customer and the banking system, effectively transferring the customer's trusted relationship with the bank to the new party.
Elements of an e-Business solution
The vision of e-business is that enterprises will have access to a broad range of trading partners to interact and collaborate with, and not only buy and sell more efficiently. Also, it is expected that e-business will contribute towards the agility of business organizations and, with that, to reaching higher levels of customization. In this way, an organization can maximize supply chain efficiency, improve customer service and increase profit margins.
Hence the need to make mission-critical processes (inventory, accounting, manufacturing and customer support) able to interact with each other by becoming web-enabled. This is achieved by ERP, CRM and other systems making use of distributed applications that extract data and launch business processes across many or all of the above processes.
The key elements of an e-business solution are:
1. Customer relationship management (CRM)
2. Enterprise resource planning (ERP)
3. Supply chain management (SCM)
4. Knowledge management
5. E-markets
Customer Relationship Management (CRM)
CRM systems are "front-office" systems which help the enterprise deal directly with its customers. CRM is the process of creating relationships with customers through reliable service, automated processes, personal information gathering, processing and self-service throughout the enterprise in order to create value for customers.
There are 3 categories of user applications under CRM:
• Customer-facing applications: Applications which enable customers to order products and services.
• Salesforce-facing applications: Applications that automate some of the sales and salesforce management functions, and support dispatch and logistic functions.
• Management-facing applications: Applications which gather data from the previous applications, provide management reports and compute return on relationships (RoR) as per the company's business model.
Enterprise Resource Planning (ERP)
ERP systems are often called "back-office" systems. ERP systems are management information systems that integrate and automate many of the business practices associated with the operations or production aspects of a company. ERP software can aid in the control of many business activities such as sales, delivery, production, billing, inventory, shipping, invoicing and accounting.
A typical ERP system is designed around these primary business procedures:
• Production: manufacturing, resource planning and execution process
• Buying a product: procurement process
• Sales of a product and services: customer order management process
• Costing, paying bills, and collecting: financial/management accounting and reporting process.
Supply Chain Management (SCM)
A supply chain is a network of facilities and distribution options that performs the functions of procurement of materials, transformation of these materials into intermediate and finished products, and distribution of these finished products to customers. SCM deals with the planning and execution issues involved in managing a supply chain.
A supply chain has 3 main parts:
• Supply side: concentrates on how, where from, and when raw materials are procured and supplied to manufacturing.
• Manufacturing side: converts raw materials to finished products.
• Distribution side: ensures that finished products reach the final customers through a network of distributors, warehouses and retailers.
Knowledge Management
This relates to the identification and analysis of available and required knowledge assets and related processes. Knowledge assets encompass two things: information and experience. Knowledge assets comprise all the knowledge that a business has or needs to have in order to generate profits and add value.
Knowledge management includes the subsequent planning and control of actions to develop both the knowledge assets and the processes to fulfill organizational objectives. Knowledge is a strong denominator of a business model and determines business competencies, especially when it is unique to the business and so must be kept in-house.
E-Markets
An e-market is an electronic meeting place for multiple buyers and sellers, providing many participants with a unified view of sets of goods and services and enabling them to transact using many different mechanisms. An e-market uses Internet technology to connect multiple buyers and suppliers.
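The "unified view" idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `Offer`, `EMarket`, `publish`, `unified_view` and `best_offer`, the supplier names and the prices are hypothetical, and only one transaction mechanism (fixed-price catalog lookup) is shown, whereas a real e-market would typically support several.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Offer:
    """One supplier's offer for one product (hypothetical structure)."""
    supplier: str
    product: str
    unit_price: float


@dataclass
class EMarket:
    """Hypothetical e-market: collects offers from many suppliers."""
    offers: list = field(default_factory=list)

    def publish(self, supplier, catalog):
        """A supplier publishes its catalog as {product: unit_price}."""
        for product, unit_price in catalog.items():
            self.offers.append(Offer(supplier, product, unit_price))

    def unified_view(self, product):
        """All offers for a product across every supplier, cheapest first."""
        matches = [o for o in self.offers if o.product == product]
        return sorted(matches, key=lambda o: o.unit_price)

    def best_offer(self, product):
        """Cheapest offer for a product, or None if nobody sells it."""
        view = self.unified_view(product)
        return view[0] if view else None


# Two suppliers publish catalogs; a buyer sees one combined view.
market = EMarket()
market.publish("SupplierA", {"bolt": 0.12, "nut": 0.05})
market.publish("SupplierB", {"bolt": 0.10})
best = market.best_offer("bolt")
```

The buyer queries the market rather than each supplier in turn, which is the sense in which the e-market connects multiple buyers and suppliers through a single meeting place.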
E-Business Roles and Their Challenges
• There are two main roles in the e-business scenario:
o The Buyer: Buyers are organizations that purchase goods and services directly from suppliers.
o The Supplier: Suppliers are organizations that market and sell goods or services directly to buyers or indirectly through diverse sales channels, including Web-based procurement systems and electronic marketplaces.
• Suppliers typically provide buyers with web-based services necessary for completing e-business transactions.
• Buyers (customers) can thus review product information, receive customer service, ordering services and customization support facilities, and can submit or modify orders.
• An additional role is that of Market Makers, which are third-party organizations that run e-markets.
• Each role has distinct business and technical challenges, but they all coalesce around a common point.
• For buyers as well as for suppliers, the primary challenge is the ability to reach a critical mass of trading partners and transaction volume to sustain their business.
• For suppliers especially, the following challenges exist:
o Managing multiple selling channels, based on various technologies, protocols, data formats, and standard business processes.
o Having the ability to take multiple types of orders once the customer has decided to conduct e-business-enabled order management through the various selling channels.
o Having the ability to differentiate and customize products and services from those of other suppliers, and offering them through the various selling channels.
o Having the ability to adapt and grow the e-business without incurring drastic technology changes, organizational restructuring, sweeping changes in the business process, or radical new investments.
• To meet the needs of buyers and suppliers, e-business strategy and solutions must be built on the following basic principles:
o Empowering suppliers & buyers: Different channels.
o Enabling suppliers of all sizes:
E-Business Requirements
• Identify/measure quantifiable business objectives: Companies must accurately measure the impact an e-business initiative has on their business processes and decide whether this initiative is worth pursuing and has sustainable long-term effects.
• Ensure organizational/operational flexibility: Enterprises must reposition themselves in their mission, structure and execution to prosper in a substantially more dynamic environment.
• Rethink entire company supply chains: Companies must rethink their entire supply chains in order to optimize performance and value as they seek to better integrate with suppliers and customers, share information, interlink processes, and outsource manufacturing, logistics systems and maintenance activities.
• Transform the company to a process-centric form: Companies must be conceptualized as a set of business processes, with more emphasis on maximizing the efficiency of processes rather than departmental or functional units.
• Define business processes: Companies must create models of existing processes and interactions, determining the relevant events, time frames, resources and costs associated with business processes, hence making them well-defined and measurable.
• Understand security requirements: The breadth of access and interaction requirements of an e-business solution requires the ability to provide controlled and focused access by all the users.
• Align business organizations with a flexible IT architecture: In response to demands for end-to-end e-business solutions, companies are expanding their applications to include enhanced integration capabilities. This includes integration of business processes at varied levels, from applications and data, across (and within) organizations.
• Establish ubiquity within standards: None of the many integration technologies available from various IT vendors has achieved complete coverage. These do work within organizations but not across global enterprises and between separate enterprises. Attempts are being made to establish open standards for interoperability.
• A number of business-driven and technology-driven requirements are compelling forces that enable successful development and deployment of integrated end-to-end e-business applications. Some of these are:
o Efficient business process management technology
o Efficient B2B communication
o Efficient enterprise application integration technology
Other categorizations view the problem differently.
A more basic approach to viewing e-business requirements is as follows:
• Trust: The biggest requirement for running a successful e-business is trust. In this age of Facebook and MySpace, online merchants may think that the privacy of a customer's information isn't important, but the opposite is true.
o Thus, businesses must be trustworthy to operate online. Consumers will not simply give their financial information to just anyone, so a site will lose business if consumers do not feel comfortable that it is a reliable, upstanding company.
o Companies must have comprehensive privacy policies and stick with them. Another good idea is to get digital certificates and TRUSTe seals, which are awarded by third-party organizations after they research the legitimacy of an online website.
o Such awards put consumers' minds at ease. Finally, even if an e-business does all this, it must also be trustworthy in the sense of fulfilling its promises: be up front with consumers about pricing and delivery times.
• Privacy policy: In addition to the way privacy laws apply in the "real" world, there are some special things to think about when dealing with the Internet and e-business.
o You should fully understand how your website fits into privacy law requirements.
o If your website collects personal information, you should develop a proper and legally compliant privacy policy and post it in a readily visible location on your website.
o If you use cookies or similar means to track visitors, depending on how you do that, you may still need to develop and post a policy.
o Online profiling may require the consent of the individual, depending on the circumstances.
o Keep in mind that people do look for privacy policies so, without a policy, you may lose prospective customers.
o A properly drafted privacy policy or statement will not only minimize your legal exposure, it can serve a marketing function as well, allowing you to attract and retain customers who otherwise might not be as inclined to deal with you.
o Do not create a policy and then fail to follow it precisely. This is an invitation for disaster, including not only possible legal problems, but also injury to your reputation and goodwill.
o It is important to not just let the policy sit once it has been posted. It should be revisited regularly to determine whether or not it is still accurate and to evaluate whether or not it should be revised to assist you in your business goals and objectives.
• Strategy: E-commerce merchants must also have a strategy to succeed in the online marketplace. Many people start websites because they think it is a quick and easy way to make cash, but in fact it takes a much greater investment than most people expect.
• Therefore, before launching a site, businesses must have strategies to handle issues large and small:
o How consumers will place orders,
o How deliveries will be made,
o How customer service issues will be handled.
o More broadly, how much owners expect to earn over a certain period, how consumers will find the site, and how success will be judged.
o Online merchants without strategies will soon be overwhelmed by such issues.
• Suitability: Finally, merchants must decide if their products are suitable for the web. Requirements for successful e-businesses concern the goods and services themselves:
o Can they be delivered quickly and cheaply?
o Do they appeal to people outside a small geographic area?
o Will going online save money?
o Will the benefits outweigh the costs?
• Technological Requirements:
o Achieving Real-Time Flexibility: In theory, digital things are easier to change than physical things. It is faster to edit a memo using a word processor than a typewriter (and you don't get ink on your fingers).
o But when programming is required to change content or access policies, maintaining a complex Web site can range from onerous to impossible. Market factors change in real time, and so must the logic and content of an e-business site.
o To achieve this vision, the next generation of e-business systems must provide a framework for automated information exchange between all the stakeholders in a business.
o These new frameworks are designed for flexibility so companies can change content and business logic in real time to meet changing business needs and market conditions.
o This adaptability comes from a set of core services, common to all applications, which enable rapid deployment of new applications and new information and which work together to create a compelling, unified e-business environment.
o An e-business framework must include packaged, ready-to-deploy services for:
o An Architecture for e-Business: As e-business moves beyond simple transactions to encompass all the complex processes through which a company provides value, information systems must orchestrate the function of enterprise applications and information resources for total information flow. And they must empower business people with the tools to manage content publishing, delivery, and access, so that business results don't depend on the IT department's programming backlog.
o Three-Tier Object-Centered Design: To achieve true, real-time e-commerce, next-generation e-business systems must be built around a 3-tier application paradigm with a clear abstraction and true separation of user interface presentation, business logic, and content. Separation and abstraction of these layers is achieved through the use of business objects, particularly in the middle layer.
o When separating presentation, application, and data logic, three things must be considered:
• User Interface: The user interface must support a variety of interface mechanisms, including Web browsers for users, business managers and designers, and desktop applications for developers.
• Business Logic: The middle tier must not only implement and execute business logic, it must also provide the framework of services that enable e-business, including security services, transaction services, and caching, pooling, and other load-balancing services to improve overall system performance.
• Content: The content layer includes corporate databases, document stores and other knowledge repositories.
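The separation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a reference design: the class and function names (`ContentStore`, `PricingService`, `render_quote`), the product and the prices are all hypothetical, and the middle tier here omits the security, transaction, caching and pooling services the notes mention.

```python
class ContentStore:
    """Content tier: stands in for corporate databases and document stores."""

    def __init__(self):
        # Hypothetical catalog data; a real system would query a database.
        self._prices = {"widget": 20.0}

    def get_price(self, product):
        return self._prices[product]


class PricingService:
    """Middle tier: business logic only; it knows nothing about presentation."""

    def __init__(self, store, discount_rate=0.0):
        self.store = store
        self.discount_rate = discount_rate  # business rule, changeable at run time

    def quote(self, product, quantity):
        unit_price = self.store.get_price(product)
        return round(unit_price * quantity * (1 - self.discount_rate), 2)


def render_quote(product, quantity, total):
    """Presentation tier: formatting only; no pricing rules live here."""
    return f"{quantity} x {product}: {total:.2f}"


store = ContentStore()
pricing = PricingService(store)
before = render_quote("widget", 3, pricing.quote("widget", 3))

# A business rule changes; neither the content nor the presentation code is edited.
pricing.discount_rate = 0.10
after = render_quote("widget", 3, pricing.quote("widget", 3))
```

Because the discount lives only in the middle tier, the rule can change without touching the stored content or the rendering code, which is the kind of flexibility the three-tier paradigm is meant to provide.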
o All Objects Are Not Created Equal: The overall architecture of an eBusiness
system is important, but proper abstractions achieved through object technology
are the foundation of a flexible eBusiness system. Correct separation of
presentation, business logic, security functions, and content determines the
flexibility of the system and the pace and effectiveness of eBusiness processes.
To deliver truly dynamic, real-time communication, these relationships must be
established on a per-transaction basis, as each page is assembled for delivery to
a user. eBusiness processes lend themselves to this kind of abstraction.
o Bringing Order To Content Management: As companies move more of their
business processes onto the Web in search of greater sales or efficiency, Web sites
are growing in size and complexity. Static Web sites often consist of hundreds or
even thousands of Web pages, and tens of thousands of lines of code. Multimedia
sites are becoming the standard, with everything from sophisticated graphics and
animation to audio and video. Enterprise Web sites must integrate multiple
applications from the back office to the supply and sales chain, while maintaining
security and the integrity of business information. As sites become larger and more
complex, traditional Web publishing systems, with their hard-coded Web page
content, become unmanageable. Content creators swamp programmers with requests
for new Web pages, the approval process bogs down, and users no longer have
access to current content.
o Dynamic Web Environment: The graphical layouts used in next-generation
eBusiness systems are more "intelligent" and manageable than the templates used in
traditional Web publishing. While both control placement of graphic elements, style,
etc., templates access content through business logic hard-coded into the body of
the page. Pages with different content, however similar, require different source
files. Layouts, on the other hand, are a next-generation approach that does not
embed business logic in the presentation objects. A layout controls only style and
placement of elements on the page. The logic that determines content is separate
from the layout, and can be changed and maintained independently.
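A small Python sketch of the layout idea, using the standard library's string.Template. The layout string, the select_content rule and the "returning" segment name are all hypothetical; they only show a layout that fixes placement and style while a separate function decides what content fills it.

```python
from string import Template

# A layout controls only style and placement; it contains no business logic.
PRODUCT_LAYOUT = Template("<h1>$title</h1>\n<div class='body'>$body</div>")


# Content rules live outside the layout and can change independently.
def select_content(user_segment):
    # Hypothetical rule: returning customers see an offer, others an overview.
    if user_segment == "returning":
        return {"title": "Welcome back", "body": "Your loyalty offer is ready."}
    return {"title": "Welcome", "body": "Browse our product overview."}


def render_page(layout, user_segment):
    return layout.substitute(select_content(user_segment))


print(render_page(PRODUCT_LAYOUT, "returning"))
```

One layout serves both kinds of visitor. With a traditional template, the rule inside select_content would be written into the page itself, so each variation would need its own source file.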
o Content Management Tools: Content Management tools must enable content to
grow and change at Web speed.
o Team Content Development: Publishing content to the Web and extending the
functionality of a site takes a whole team: developers to build site structure and
implement business rules, designers to create page layouts and define a consistent
look and feel, and business managers to define business rules and contribute
content.
o Collaboration Across The Extended Enterprise: Publishing and managing Web
content typically involves an approval process and some administrative work.
o Centralized Rules Based Content Management: Anyone should be able to
manage site content simply by defining a few document characteristics when a
document is published. With characteristics such as a document's type (for
example, "data sheet"), format, and activation/expiration dates in place, links to
the document can be automatically populated throughout the site, and
document visibility and "document migration" can be automated.
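As a rough illustration of rules-based content management, the Python sketch below attaches the characteristics named above (type, format, activation and expiration dates) to each document and derives the links for a page from them. The Document class, the function names and the sample documents are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    title: str
    doc_type: str      # e.g. "data sheet"
    fmt: str           # e.g. "pdf"
    activation: date   # first day the document may be shown
    expiration: date   # last day the document may be shown


def is_visible(doc, today):
    return doc.activation <= today <= doc.expiration


def links_for(documents, doc_type, today):
    """Titles that a page listing `doc_type` would link to on `today`."""
    return [d.title for d in documents
            if d.doc_type == doc_type and is_visible(d, today)]


# Hypothetical sample documents.
docs = [
    Document("Widget 2.0 data sheet", "data sheet", "pdf",
             date(2010, 1, 1), date(2010, 12, 31)),
    Document("Widget 1.0 data sheet", "data sheet", "pdf",
             date(2009, 1, 1), date(2009, 12, 31)),
    Document("Q1 press release", "press release", "html",
             date(2010, 1, 15), date(2010, 6, 30)),
]
print(links_for(docs, "data sheet", date(2010, 3, 1)))
```

Nobody edits a page to retire the expired data sheet; it drops out of the list once its expiration date has passed.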
o Customized Content Delivery
o Pervasive Personalization
o Knowledge-Based Personalization: Effective personalization depends on the
ability to customize a user's experience based on a rich, centrally stored user
profile: in essence, a knowledge base that consists of user information and
expertise on how to apply that information. This kind of knowledge base cannot
be bolted onto a brochureware Web site. The ability to gather and apply
user-related knowledge must be integrated into the eBusiness system from day one,
so that information can be contributed, shared, and leveraged by all the
applications in the system.
o User profiles are the crown jewels of an eBusiness strategy. The quality of
profiles determines the degree to which the user experience can be personalized.
Profiles can and should be built through both explicit and implicit mechanisms.
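A minimal Python sketch of a profile built through both mechanisms: explicit data is what the user states, implicit data is what the site observes. The UserProfile class and the rule that a stated interest overrides observed behaviour are assumptions chosen for illustration, not a prescribed design.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    user_id: str
    explicit: dict = field(default_factory=dict)          # what the user told us
    page_views: Counter = field(default_factory=Counter)  # what we observed

    def record_preference(self, key, value):
        self.explicit[key] = value

    def record_view(self, category):
        self.page_views[category] += 1

    def top_interest(self):
        # Assumed rule: a stated interest wins; otherwise use the most viewed category.
        if "interest" in self.explicit:
            return self.explicit["interest"]
        if self.page_views:
            return self.page_views.most_common(1)[0][0]
        return None


profile = UserProfile("u1")
for category in ["orchids", "orchids", "roses"]:
    profile.record_view(category)
implicit_guess = profile.top_interest()

profile.record_preference("interest", "roses")
explicit_answer = profile.top_interest()
print(implicit_guess, explicit_answer)
```

Because the profile is a single shared object, any application in the system can contribute to it or read from it.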
o Inclusive Security
o Scalability To Compete: As more and more processes are adapted to eBusiness,
a Web site may grow to support thousands of users, millions of documents and
millions of transactions each day. An eBusiness system must have the power to
perform fast and reliably, as a business grows, while delivering the dynamic,
personalized content necessary to achieve business goals.
o Enterprise Integration and Transaction Monitors: Any business process can be
"Webified" with a CGI interface or a few server pages. But a process Webified in
isolation misses the point: eBusiness is about providing new value by doing
business in a fundamentally new way.
Integration is the goal and the heart of eBusiness: integrating and exposing
applications and content in a personalized way to speed, scale and improve
business processes and to engage, involve, and build lasting relationships with
customers and business partners.
o Transaction management guarantees that users have a consistent view of
business information. For example, the eBusiness system should prevent the
customer from completing an order based on one price, then being charged based
on the new price. Transaction management also prevents inaccurate results based
on system failures (e.g., the system goes down and loses an order but continues to
process the billing using already-transmitted credit card information). Robust
system logs can help coordinate updates across multiple data sources from
multiple vendors or roll back changes in case of system failure.
o Delegated System Management: eBusiness systems are distributed by their
very nature, coordinating information sharing among applications, business
functions and departments, and partners up and down the supply chain. Bringing
business processes to the Web increases the complexity of the eBusiness site, and
growing and changing numbers of users and applications increase the complexity of
managing a site. No centralized IT department could effectively maintain current
accounts or access privileges for all users, inside and outside the company. Most
Web sites today are not sophisticated enough to reach this roadblock, but as
businesses open and extend their processes via the Web, system manageability
will become an increasingly serious issue.
o Time to Market: Time to market must be minimal as delays may result in
losing the benefit of eBusiness integration.
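The transaction management requirement in the list above (an order and its billing must succeed or fail together, at the price the customer was quoted) can be sketched with Python's built-in sqlite3 module. The table names, the place_order function and the simulated failure flag are hypothetical; the sketch uses a single in-memory database, whereas a real eBusiness system would typically coordinate several data sources through a transaction monitor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, price REAL)")
conn.execute("CREATE TABLE billing (order_id INTEGER, amount REAL)")
conn.commit()


def place_order(conn, item, quoted_price, fail_before_billing=False):
    """Record the order and its charge as one unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute(
                "INSERT INTO orders (item, price) VALUES (?, ?)",
                (item, quoted_price))
            if fail_before_billing:
                raise RuntimeError("simulated system failure")
            # The charge uses the same quoted price that was stored with the order.
            conn.execute(
                "INSERT INTO billing (order_id, amount) VALUES (?, ?)",
                (cur.lastrowid, quoted_price))
        return True
    except RuntimeError:
        return False


def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


first = place_order(conn, "widget", 25.0)
second = place_order(conn, "gadget", 40.0, fail_before_billing=True)
print(first, second, count(conn, "orders"), count(conn, "billing"))
```

The failed attempt leaves no half-finished order behind: the insert into orders is rolled back together with the billing step that never ran.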
Impacts of e-Business
< Improved operational efficiency and productivity: by eliminating operational
waste and automating inefficient business practices, organizations can realize
productivity gains.
< Reduction in operating costs and costs of goods and services: by connecting
directly with suppliers and distributors, organizations can realize more efficient
processes that result in reduced unit costs for products or services and lower
prices to customers while achieving economies of scale.
< Improved competitive position: global reach, rapid growth, efficient reduction of
product time to market and optimization of product distribution channels all contribute
to superior competitive position.
< Penetration into new markets through new channels: with eBusiness, location is
of no consequence when it comes to reaching customers.
< Improved communication, information and knowledge sharing: alignment of key
supply chain partners with an organization's internal strategies helps exploit their
expertise and knowledge, hence creating opportunity to secure long-term business by
embedding their process and procedures in those of their customers' supply chains.
< Harmonization and standardization of process
< Improved internal information access
< Improved relationships with suppliers and improved customer service
Inhibitors of eBusiness
< Management/Strategy issues
o e-business strategy
o Organizational changes required by e-business
o Management attitudes and organizational inflexibility
< Cost/financing issues
o Costs of implementation
o Calculating the Return on Investment (ROI)
< Security and Trust Issues
< Legal Issues
o Few companies are familiar with the rules and regulations that apply to an
online environment. This leads to uncertainty.
o Different strokes for different folks!
< Technological Concerns
o Integration Issues
< Arguments against Investment
o Uncertainty & Fear
E-Business Strategies
What is an E-Business Strategy?
< EBusiness has triggered new business models, strategies and tactics that are made
possible by the Internet and other related technologies.
< In order to compete in the marketplace, it is essential for organizations to establish
strategies for the development of an e-business.
< EBusiness strategy can be viewed from two different viewpoints, which are
explained below.
< One view defines strategy as plans and objectives adopted to achieve
higher-level goals.
< In that sense, a strategy is developed to achieve a goal like implementing
organizational change, or a large software package such as an ERP system.
< Strategy may also relate to plans concerning the long-term position of the firm in its
business environment to achieve its organizational goals.
< Based on the above, we arrive at a common definition for an eBusiness Strategy.
< An eBusiness strategy is the set of plans and objectives by which applications of
internal and external electronically mediated communication contribute to the
corporate strategy.
< Strategic planning comprises a distinct class of decisions (a plan is a set of decisions
made for the future) and objectives, and has to be positioned next to tactical
planning (structuring the resources of the firm) and operational planning (maximizing
the profitability of the current operations).
< Strategy is concerned with changes in the competitive environment that may trigger
strategic changes for the individual firm and so affect its roles and functions in the
market.
< Reassessment of strategy may occur due to:
o New Products
o Changing customer preferences
    Flowers: Roses / Carnations → Orchids
    A few years back when people went to the florist, they generally
    picked up Roses or Carnations etc. Now, they prefer Orchids. This is
    an example of changing customer preferences. A global notion is that a
    customer does not realize the utility of, or feel the need for, a product
    until it is offered to him / her.
o Changing demand patterns
o New competitors
< The frequency, dynamics and predictability of the above changes dictate the
intensity of the strategic planning activity of the firm.
< So, eBusiness strategy (revised) is:
o The set of plans and objectives by which applications of internal and external
electronically mediated communication contribute to the corporate strategy.
< EBusiness strategy may be implemented for:
o Tactical purposes: Mail → EDI → XML/EDI
o Achieving corporate strategy objectives
< EBusiness is strategic in nature.
o The idea is to create a preferably sustainable & competitive position for the
company.
    This is achieved by integration of the Internet and related technologies
    in its primary processes.
< EBusiness must not only support corporate strategy objectives but also
functional strategies (SCM, Marketing)
< Supply Chain Management Strategy
o Based on value chain analysis for decomposing an organization into its
individual activities and determining value added at each stage.
o Gauge efficiency in use of resources at each stage.
< Marketing Strategy
o Is a concerted pattern of actions taken in the market environment to create
value for the firm by improving its economic performance.
o Focused on capturing market share or improving profitability via
brand-building etc.
o Operates on CURRENT AS WELL AS FUTURE projections of customer demand.
< Information Systems Strategy
o How to leverage information systems in an organization to support the objectives
of an organization in the long run.
< EBusiness strategy is based on corporate objectives.
Strategic Positioning
Strategic positioning means that a firm is doing things differently from its competitors in a way
that delivers a unique value to its customers. There are 6 fundamental principles a firm must
follow to establish and maintain a distinctive strategic position:
1. Start with the right goal: superior long-term ROI.
2. Strategy must enable it to deliver a value proposition different from competitors.
3. Strategy must be reflected in a distinctive value chain.
4. Accept trade-offs for a robust strategy.
5. Strategy must define how all elements of what a firm does fit together.
6. Strategy must involve continuity of direction.
Levels of e-Business Strategies
Strategies will exist at different levels of an organization. Strategic levels of management
are concerned with integrating and coordinating the activities of an organization so that
the behavior is optimized and its overall direction is consistent with its mission. Ultimately
eBusiness is about communication, within business units and between units of the
enterprise as well as organizations.
1) Supply Chain or Industry Value Chain level
< EBusiness requires a view of the role, added value, and position of the firm
in the supply chain.
< Important issues that need to be addressed at this level are:
i. Who are the firm's direct customers?
ii. What is the firm's value proposal to the customers?
iii. Who are the suppliers?
iv. How does the firm add value to the suppliers?
v. What is the current performance of the Supply Chain in terms of revenue
and profitability, inventory levels etc.?
vi. More importantly, what are the required performance levels?
vii. What are the current problems in the chain?
< This sort of analysis gives insight into upstream (supplier side) and
downstream (customer side) data and information flows.
2) The Line of Business or (Strategic) Business Unit level
< Understanding the position in the value chain is a starting point for further
analysis of how Internet related technologies could contribute to the competitive
strategy of a business.
< This is the level where competitive strategy in a particular market for a
particular product is developed (Strategic Positioning).
< There are four generic strategies for achieving a profitable business:
i. Differentiation: This strategy refers to all the ways producers can make
their products unique and distinguish them from those of their competitors.
ii. Cost: Adopting a strategy for cost competition means that the company
primarily competes with low cost; customers are interested in buying a
product as inexpensively as possible. Success in such a market implies
that the company has discovered a unique business model which makes it
possible to deliver the product or service at the lowest possible cost.
iii. Scope: A scope strategy is a strategy to compete in markets worldwide,
rather than merely in local or regional markets.
iv. Focus: A focus strategy is a strategy to compete within a narrow market
segment or product segment.
3) The Corporate or Enterprise level
< This level comprises a collection of (strategic) business units.
< This level addresses the problem of synergy through a firm-wide, available IT
infrastructure.
< Common eBusiness applications throughout the organization are needed for two
basic reasons.
< From an efficiency point of view, having different applications for the same
functionality in different areas of business is needlessly costly.
< From an effectiveness point of view, there is the need for cross Line of Business
communication and shareability of data.
< The emphasis in the business plans is on the customer, not the final product.
< These all become subjects of an enterprise-wide eBusiness policy.
The changing competitive Agenda: Business & Technology Drivers
Business Drivers:
< Shift in economies from supply driven to demand driven
o Causes a shift in intent of service and quality programs, the impetus for product
development & the structure of the organization itself
o One to One marketing
o Mass Customization
Technological Drivers:
< Internet
o Pervasiveness
o Interactive Nature
o Virtual Nature
Strategic Planning Process
< The strategic planning process has the following steps:
< The strategic planning process starts with the establishment of the organization's
mission statement.
o The mission statement is a basic description detailing the fundamental
purpose of the organization's existence and encompasses strategy
development, including determination of the organization's vision and
objectives.
o It is developed at the highest level of the organization's management, and
provides a general sense of direction for all decision making within the firm.
< Strategic Analysis
o This involves situation analysis, internal resource assessment, and evaluation
of stakeholders' expectations.
o It will include
    Environmental Scanning
    Industry or market research
    Competitor Analysis
    Analysis of Marketplace Structure
    Relationships with trading partners and suppliers
    Customer Marketing Research
o Information is derived from the analysis of both internal and external factors.
o Internal Factors:
    Human resources
    Material resources
    Informational resources
    Financial resources
    Structure
    Operational Style
    Culture
o External Factors:
    Sociocultural forces
    Technological forces
    Legal and regulatory forces
    Political forces
    Economic forces
    Competitive forces
o Any realistic new plan will have to reflect the reality of both the external world
and the internal dynamics of the corporation.
< Strategic Choice
o It is based on the strategic analysis and consists of four parts:
    Generation of strategic options
    Highlighting possible courses of action
    Evaluation of strategic options on their relative merits
    Selection of strategy
o Strategic choice results in Strategic Planning, which is concerned with the
organizing and detailing of all the strategies that will be undertaken throughout
the organization.
o Planning includes strategy specification and resource allocation.
o It commences with corporate-level planning that determines the overall
direction for the organization.
o This drives Division (or Strategic Business Unit) level planning which deals with
groups of related products offered by the organization.
o These plans in turn become the starting point for operating (or functional)
level planning, which involves more local plans within specific departments of the
organization.
< Implementation
o This relates to the actual tasks that must be executed in order to realize a
plan and translates strategy into action.
o It includes monitoring, adjustment, control as well as feedback.
Strategic Alignment
< In the 1980s the concept of alignment between business and IT was developed.
< According to this concept it is not only feasible to design and build a technically
sophisticated infrastructure for eBusiness, but also to formulate business strategies
that complement and support this infrastructure.
< One of the major issues regarding an enterprise's investment in IT is whether this is
in harmony with its strategic objectives.
< This state of harmony is referred to as alignment.
< Alignment is complex, multifaceted and almost never completely achieved. It is about
continuing to move in the right direction and being better aligned than the competitors.
< Any eBusiness strategy should articulate an enterprise's intention to use information
technology based on business requirements.
< When formulating the IT strategy, the enterprise must consider:
o Business objectives and the competitive environment
o Current and future technologies and the costs, risks, and benefits they can bring
to the business.
o The capability of the IT organization and technology to deliver current and future
levels of service to the business.
o Cost of current IT, and whether this provides sufficient value to the business.
o Lessons learned from past failures and successes.
Consequences of e-Business
< As eBusiness is an information technology-enabled organizational phenomenon
with economic consequences, economic theories appear to be particularly useful for
analyzing the business effects.
< Strategy is about finding the right (external) fit between organization and
environment. Different schools of thought have approached this problem from different
angles.
< When analyzing the business effects of an eBusiness, we will consider the
following approaches:
o The Theory of Competitive Strategy
o The resource-based view
o The theory of transaction costs
< Theory of Competitive Strategy
o The structural attractiveness of a firm is determined by five underlying
forces of competition:
    The bargaining power of the customers
    The bargaining power of the suppliers
    The barriers to entry for new competitors
    The threat of new substitute products or services
    The competition among existing firms in the industry
o In combination, these forces determine how the economic value created by
any product, service, technology or way of competing is divided between
companies in an industry.
o The bargaining power of customers for a firm could, for instance, depend on the
degree of product differentiation, and the size of demand and supply. Switching
costs are also very important: they answer the question of how much it will cost
the customer to change to another supplier.
o The bargaining power of suppliers is dependent on a variety of factors, such as
relative size, number of suppliers that can deliver a critical resource, and so
on. The Internet causes another specific threat from the perspective of IT
suppliers; they may bypass their customers and directly approach the end user.
o The barriers to entry for new competitors depend on how difficult it is to join the
industry. Economic and technological thresholds may prevent outside potential
competitors from coming in. Economies of scale, necessary capital, and specialized
expertise are important factors in this respect.
o The threat of substitute products depends on the question of whether other
products can deliver added value for consumers instead of current products in
the absence of switching costs. E.g., the Internet is a serious threat to the Post
Office.
o The level of competition among existing firms in the industry will depend on
various factors like type of market, existing competitive behavior, and so on.
< The Resource-Based View
o According to the theory of economic development, innovation is the source
of value creation.
o Several sources of innovation (hence, value creation) are identified:
    The introduction of new goods or new production methods,
    The creation of new markets,
    The discovery of new supply sources,
    And the reorganization of industries.
o The resource-based view (RBV), which builds on the theory of economic
development's perspective on value creation, regards a firm as a collection of
resources and capabilities.
o The RBV looks at available resources first to see how a position in the business
environment can be acquired with them.
o According to this view, a firm can build a strategic position by picking the
right resources and building competencies that are unique and difficult to
imitate.
o Resources are considered the raw material for building competencies.
o The RBV states that marshalling and uniquely combining a set of
complementary and specialized resources and capabilities may lead to value
creation.
o A firm's resources and competencies are valuable if, and only if, they reduce
a firm's costs or improve its revenues.
o Core competencies of an organization encompass knowledge bases, skill sets,
and service activities that can create a continuing competitive advantage.
< Transaction Cost Economics
o Transaction Cost Economics attempts to explain firms' choices between
internalizing and buying goods and services from the market.
o According to transaction cost theory, exchanges with external firms entail a
variety of coordination costs associated with various aspects of inter-firm
transactions.
o The central question addressed by transaction cost economics is why firms
internalize transactions that might otherwise be conducted in markets. Thus, two
key issues concerning firms are:
    Which activities should a firm manage within its boundaries, and which
    activities should it outsource?
    In which way should a firm manage its relationship with its customers,
    suppliers and other business partners?
o According to transaction cost economics, a firm has two options for
organizing its economic activities: an internal hierarchical structure where it
integrates the activity into its management structure, or a market-like
relationship with external firms.
o Critical dimensions of transactions influencing the choice of the most effective
governance form are:
    Uncertainty
    Exchange Frequency
    Specificity of Assets enabling the exchange
o Transaction costs include the costs of planning, adapting, executing and
monitoring task completion.
o Transaction cost theory assumes that markets are not perfect, so using them
leads to costs, like search and monitoring costs.
o As Internet technology is expected to significantly reduce transaction costs, this
theory provides a basis for assessing the effects of the Internet on new and
existing business models.
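The make-or-buy reasoning behind transaction cost economics can be put into a small Python sketch. It is a deliberate simplification under stated assumptions: the MarketOption class, the three cost categories and all figures are hypothetical, and the decision rule compares only total cost, ignoring uncertainty, exchange frequency and asset specificity.

```python
from dataclasses import dataclass


@dataclass
class MarketOption:
    purchase_price: float    # price paid to the external supplier
    search_cost: float       # finding and comparing suppliers
    contracting_cost: float  # negotiating and adapting the agreement
    monitoring_cost: float   # checking that the task was completed

    def total_cost(self):
        return (self.purchase_price + self.search_cost
                + self.contracting_cost + self.monitoring_cost)


def choose_governance(internal_cost, market):
    """Return 'market' when buying is cheaper overall, else 'hierarchy'."""
    return "market" if market.total_cost() < internal_cost else "hierarchy"


# Hypothetical figures per unit of the activity; internal production costs 100.
before_internet = MarketOption(purchase_price=80, search_cost=15,
                               contracting_cost=10, monitoring_cost=10)
with_internet = MarketOption(purchase_price=80, search_cost=3,
                             contracting_cost=5, monitoring_cost=4)

print(choose_governance(100, before_internet))
print(choose_governance(100, with_internet))
```

With these assumed numbers the purchase price never changes; only the transaction costs fall, and that alone is enough to move the activity from the internal hierarchy to the market.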
Success factors for Implementation of e-Business Strategies
< Transforming an enterprise from a traditional organization to an e-business based
organization is a complex endeavor.
< It is essential that senior management develops and endorses a broad strategic vision.
< Once the strategy has been determined and approved, the implementation
strategy has to be chosen.
< Two approaches prevail:
o The top-down approach: According to this, business transformation is a
business-wide phenomenon that can only be implemented business-wide.
o The bottom-up approach: In this approach, business reengineering starts as an
experiment in an inconspicuous part of an organization. Lessons are learnt from
this experiment, and the knowledge is transferred to other parts of the
organization.
< Although the bottom-up approach has strong support, especially in the case of
innovation, central coordination of the transformation activity is mandatory.
< To provide for the central coordination, program management has to be instituted. A
core part of program management is multi-project management, the main objectives of
which are:
o Recognize dependencies between projects
o Share scarce resources in an overall efficient way
o Systematically utilize experiences from single projects
< Program management is characterized by:
o Program organization,
o Policies,
o Plans,
o Communication,
o Alignment.
< Leading a change project or business-wide initiative requires persons that plan the
change and build business-wide support; these are called 'change agents'.
< Change Agents are part of the program management organization.
< In principle, everyone involved in a change project can assume the role
of a change agent.
< Three types of change agent roles have been identified:
o Traditional: In the traditional model, the Information System (IS) specialists
focus on the delivery of the implementation of the technology, without
considering the organizational aspects. Consequently they become technicians
with a narrow area of expertise.
o Facilitator: In the facilitator model, the central idea is that people, not
technologies, create change. The change agent brings together all the
conditions necessary for the change. In this model, the change agent remains
'neutral'; the organization is responsible for the change.
o Advocate: In this role, change agents focus on inspiring people to adopt the
change. Unlike the facilitator, the advocate does not remain neutral, but uses any
tactic (persuasion, manipulation, power etc.) to make the changes accepted.
< Especially in the case of eBusiness transformation, where organizational and IT
changes relate to infrastructure and issues of commonality and interoperability, the
advocate model seems to be appropriate.
Pressures Forcing Business Changes
< Competition
o Fiercer & More Global
< Customers have become increasingly demanding
< Integrated Demand (Travel, Car Rental etc.)
< Firms ask themselves "which of my competences are unique and of
core importance?"
o E.g., Bajaj exits the scooter market.
< Company configuration changes due to outsourcing and insourcing.
Business Models
There are various definitions for Business Models. The definitions change based on the
paradigm and the context being applied. Let's look at each definition:
< Participants in a joint business venture:
o Specifies the relationships between different participants in a commercial
venture, the benefits & costs to each and the flows of revenues. It addresses
a simple equation (profit = revenue - cost) irrespective of the model.
o Describes how the enterprise produces, delivers and sells its products or services,
thus showing how it delivers value to the customer and how it creates wealth.
< Process & structure of a business organization:
o Refers to the structures & processes in place to operationalize the
strategy of business.
o Can be described as:
    An architecture for the product, service & information flows;
    A description of the various business actors & their roles;
    A description of the potential business benefits for the various actors;
    A description of the sources of revenues.
< Perspective of a marketplace:
o Definition can be analyzed from various perspectives:
    B2B, B2C activities or both?
    Position in the value chain?
    Value proposition & target customers?
    Specific revenue models for generation of its income streams?
    Representation? Physical or Virtual or combination?
< Perspective of eBusiness:
o A descriptive representation of the planned activities of an enterprise that
involves 3 integral components which specify:
    Internal aspects of a business venture
    Type of relationships of the enterprise with its external business
    environment and its effective knowledge regarding these relationships
    How the information assets are embedded in the business venture.
< A business model can be viewed as an externalization of a firm's
internal business processes
o Does not involve internal business process complexity.
< When taking the internal aspects of a business into account the following
elements need to be defined:
o Products or Services
o Sources of revenue
o Activities
o Organization of the firm
E-Business Models
EBusiness models are classified as follows:
< Internet Enabled
o Categorized based on increasing functionality, innovation,
integration and value.
< Value Web
o Assuredly not a recipe for success but preliminary conceptions of an emerging
form of a fluid and flexible organization.
o Move from we-do-everything-ourselves-unless (value generated by single
organization) to we-do-nothing-ourselves-unless (value generated by the
network).
< EBusiness Enabled
o Especially valid for B2B contexts
o 5 Representative Business models
o Teleworking Model:
    Collaboration using communication technologies
    Classic example is Electronic Manufacturing Services (Solectron)
o Virtual Organization Model:
    Collection of geographically dispersed individuals, groups and
    organizational units. Example: GeneraLife (Insurance)
o Process Outsourcing Model:
    Example: BPOs, IBM
o Collaborative Product Development Model:
    Example: Automobile Manufacture (FORD)
o Value Chain Integration Model:
    Used to improve communication & collaboration between all supply
    chain parties.
< Market Participants
o More generic classification of Internet Based Business Models
< Cybermediaries
o In disagreement with the widely accepted idea that eBusiness will cause industry
value chains to be restructured to such an extent that intermediation will no
longer be a prominent feature.
o The real trend might just be towards an increase in intermediation by
cybermediaries.
    Organizations which operate in electronic markets to facilitate
    exchanges between producers and consumers by meeting the
    needs of both.
    Directories of Directory Services Intermediaries
    Virtual Malls
    Website Evaluators
    Auditors
    Spot Market Makers
    Financial Intermediaries (ESCROW Service for Online purchases).
Note: For details on all eBusiness models, refer to Michael Papazoglou, "e-Business:
Organizational & Technical Foundations"
Integration of Applications
e-Business Integration (Patterns)
eBusiness Integration occurs in as many forms as there are eBusinesses. At first glance,
integration problems and the corresponding solutions are seldom identical. Yet, upon
closer examination, you discover that integration solutions can actually be classified into
common categories. Each of these categories describes both a "type" of integration
problem as well as a solution method. These categories are called integration patterns.
Integration patterns help you understand the different methods available to you for a given
type of integration problem. They allow you to take a step back and understand the
differences in the various scenarios and appreciate the different approaches to integration.
Finally, they allow you to view "integration in the big picture." You can learn to break down
what may be a complex integration into conceptual categories and understand which
technologies to apply.
What Are Integration Patterns?
A pattern is commonly defined as a reliable sample of traits, acts, tendencies, or other
observable characteristics. In software development, you may be familiar with the idea of
design patterns or process patterns. Design patterns systematically describe object designs
that can be employed for a common set of problems. Similarly, process patterns describe
proven methods and processes used in software development. In practice, patterns are
simply a logical classification of commonly recurring actions, techniques, designs, or
organizations. What are integration patterns? Integration patterns emerge from
classification of standard solutions for integration scenarios. They are not patterns of design
or code. Nor are they patterns of operational processes for an integration project. Instead,
each integration pattern defines a type of integration problem, a solution technique, as well
as parameters applied for eBusiness Integration.
Following are seven common eBusiness Integration patterns. They are not meant to be
comprehensive, but they cover most of the common integration scenarios implemented
today. They encompass both EAI scenarios as well as B2Bi scenarios:
• EAI (intra-enterprise) Patterns
1. Database Replication
2. Single-Step Application Integration
3. Multi-Step Application Integration
4. Brokering Application
• B2Bi (inter-enterprise) Patterns
5. Application-to-Application B2Bi
6. Data Exchange B2Bi
7. B2B Process Integration
The EAI Patterns represent patterns commonly applied within a corporate enterprise,
whereas the B2Bi Patterns represent the different methods in conducting integrated B2B
transactions. The following sections provide a closer look at each of these patterns and
discuss some of the details.
Database Replication
The Database Replication pattern may be the most prevalent pattern of EAI integration today.
Database replication involves managing copies of data over two or more databases,
resulting in redundant data. Companies engage in database replication for numerous
reasons. One reason is that many organizations are becoming more distributed in their
operations, requiring multiple copies of the same data over several physical locations.
Replication is also a means of data recovery. In many organizations, an active secondary
database is maintained for data recovery purposes. In the event that the production
database needs to be recovered, the secondary replicated database can be used. This also
applies for "high availability" systems. In these situations, a redundant copy of "live" data
is maintained to ensure that if the first system is not available, the redundant database system
is activated. The two general categories for database replication are synchronous and
asynchronous replication.
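The difference between the two categories can be illustrated with a minimal sketch. It is not
taken from any replication product: two in-memory maps stand in for the primary and secondary
databases, and the names ReplicationSketch, writeSync, writeAsync and drain are assumptions
made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class ReplicationSketch {
    // Two in-memory maps stand in for the primary and secondary databases (assumption).
    static final Map<String, String> primary = new HashMap<>();
    static final Map<String, String> secondary = new HashMap<>();
    // Changes not yet applied to the secondary (used by the asynchronous mode only).
    static final Queue<String[]> pending = new ArrayDeque<>();

    // Synchronous replication: the write returns only after both copies are updated.
    static void writeSync(String key, String value) {
        primary.put(key, value);
        secondary.put(key, value);
    }

    // Asynchronous replication: the write returns once the primary is updated;
    // the change is queued and reaches the secondary later.
    static void writeAsync(String key, String value) {
        primary.put(key, value);
        pending.add(new String[] {key, value});
    }

    // Applies the queued changes to the secondary in the order they were made.
    static void drain() {
        String[] change;
        while ((change = pending.poll()) != null) {
            secondary.put(change[0], change[1]);
        }
    }

    public static void main(String[] args) {
        writeSync("order-1", "paid");
        writeAsync("order-2", "shipped");
        System.out.println(secondary.containsKey("order-2")); // false until drain() runs
        drain();
        System.out.println(secondary.get("order-2"));
    }
}
```

With the asynchronous variant the secondary can briefly lag behind the primary, which is the
price paid for not making every write wait on both copies.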
Single-Step Application Integration
The Single-Step Application Integration (SSAI) pattern extends the asynchronous database
replication pattern. Instead of focusing on data consistency between two databases, the
SSAI pattern integrates data between applications, moving data from one context to
another. It does so by translating data syntax of the source message and reformatting data
elements into a new target message. It is "single step" because it requires an intermediary
broker to map source messages to target messages. Typically, it is an extension of the
asynchronous replication technology, in that it utilizes Message Queuing Middleware such as
MQ Series. It is just as likely to be implemented with the less sophisticated FTP in a batch
mode. In either case, the point is that it does more than simply move data from point A to
point B for consistency's sake. Whereas, in the replication pattern both the source and target
data models are likely similar, if not identical at times, this is not necessarily the case for the
SSAI pattern. The objective here is not data consistency, but application data integration.
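The translation step performed by the broker can be sketched as follows. The source format
("id,name,amount"), the target field names and the class name SingleStepBroker are all
hypothetical and chosen only to show one source message being mapped to one target message.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

class SingleStepBroker {
    // Maps a comma-separated source record "id,name,amount" (hypothetical format)
    // to the key/value layout expected by a hypothetical target application.
    static Map<String, String> translate(String sourceMessage) {
        String[] fields = sourceMessage.split(",");
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + sourceMessage);
        }
        Map<String, String> target = new LinkedHashMap<>();
        target.put("customerId", fields[0].trim());
        target.put("customerName", fields[1].trim());
        // The target expects whole cents rather than a decimal amount.
        long cents = new BigDecimal(fields[2].trim()).movePointRight(2).longValueExact();
        target.put("amountCents", Long.toString(cents));
        return target;
    }

    public static void main(String[] args) {
        System.out.println(translate("42, Acme, 12.50"));
    }
}
```

The source and target layouts differ in both field names and units, which is the point the
pattern makes: the broker integrates application data rather than keeping two copies identical.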
Multi-Step Application Integration
The Multi-Step Application Integration (MSAI) pattern is an extension of the SSAI pattern.
MSAI enables the integration of n (source) to m (target) applications. It addresses many-to-
many integration, which SSAI cannot, by providing what is known as sequential logical
processing. In other words, steps in this pattern are processed sequentially, and rules
applied are Boolean logical in nature. Like the single-step pattern, MSAI requires an
intermediary to broker the transaction of data between applications. It is often built around an
asynchronous event-based system and typically is implemented through the use of
Message Queuing Middleware as well. The asynchronous event-based approach creates
loose coupling. Although each system is physically independent, they are logically
dependent. In other words, interdependencies exist between the application events that
can be expressed in terms of transformations and data integration rules. Data elements from
one application can drive the retrieval or processing of messages in another application. The
simplest multi-step example in Figure 3.3 involves three applications in which a message
from application A is combined with a message from application B that is reformatted for a
target application C. It is common for a data element from application A to act as a key to
drive the request for information from application B.
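The three-application case described above can be sketched in a few lines. This is only an
illustration under stated assumptions: application B is represented by an in-memory lookup
table, the message for application C is a pipe-separated string, and the names
MultiStepBroker and integrate are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

class MultiStepBroker {
    // Stand-in for application B: customer names keyed by customer id (hypothetical data).
    static final Map<String, String> applicationB = new HashMap<>();
    static {
        applicationB.put("C42", "Acme Ltd");
    }

    // Step 1: take the key (customerId) from application A's message.
    // Step 2: use that key to request information from application B.
    // Step 3: apply a Boolean rule to the data.
    // Step 4: reformat the combined data for target application C.
    static String integrate(String orderId, String customerId, int quantity) {
        String customerName = applicationB.get(customerId);
        if (customerName == null) {
            return "REJECT|" + orderId + "|unknown customer";
        }
        boolean bulk = quantity >= 100;
        return "ORDER|" + orderId + "|" + customerName + "|" + (bulk ? "BULK" : "STANDARD");
    }

    public static void main(String[] args) {
        System.out.println(integrate("O1", "C42", 150));
    }
}
```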
Brokering Application
At times integrating two applications is not principally a matter of integrating data, but
integrating business logic. The Brokering Application pattern addresses the use of
intermediary application logic to link together two or more applications. In plain terms, it
means that custom application code is written containing logic to broker interactions between
the disparate applications. This custom brokering application sits in the middle as an
intermediary for processing requests from different applications.
The use of this solution pattern is particularly applicable in the scenarios below:
• Applications Need to Reuse Logic
• Applications Linked by Complex Logic
• Applications Unified Through User Interface
Application-to-Application B2Bi
Now you're ready to move beyond EAI to learn about Application-to-Application B2Bi,
extending integration beyond the corporate enterprise. I will describe four additional
patterns related specifically to B2B integration, beginning first with the Application-to-
Application B2Bi pattern. The Application-to-Application pattern is the logical extension of
what occurs in EAI. When EAI vendors tout their products as being B2Bi, this specific pattern
is what they have in mind. However, as you will discover, this is not the only pattern and
very likely not even the primary pattern for B2Bi. Application-to-Application B2Bi, which is
often referred to as inter-enterprise integration, involves corporate
entities linking their applications directly to the applications of their partners or customers. In
practice, this type of integration is often implemented as part of a supply chain of goods and
services to the customer.
This extension for inter-enterprise integration means that a number of additional issues need to
be accounted for:
• Security
• Federated Control
• Systems Management
Data Exchange B2Bi
The limitation of the Application-to-Application B2Bi pattern is that it can be more
demanding to implement. It necessitates that each participant handles and externalizes
application native data directly. This makes it difficult to scale the B2B interaction model
rapidly when such a demand is placed on the participants. The optimal solution is to provide a
rapidly scalable B2Bi model in which participants can exchange data freely with minimal
expectation on their infrastructure. The Data Exchange B2Bi pattern enables B2B
transactions predicated on a common data exchange format. It is the most widely applied
pattern for B2B commerce today. Data Exchange B2Bi is effective because it is simple in
concept and has been in use since the days of Electronic Data Interchange (EDI), the
forerunner to today's B2B over the Internet.
Although there is a significant incumbency of legacy EDI transactions, the XML-based B2B will
ultimately displace EDI as the primary mechanism for eBusiness transactions. XML-based
data packets are transmitted between two business entities through the use of a data
exchange gateway service on both ends. One of the primary responsibilities of the
gateway service is to prepare the data packets by placing them within a security envelope.
The B2B gateway service supports security standards such as MIME, X.509, and S/Key. It is
also responsible for routing data through a standard transport. Most B2B gateway services
provide numerous transport options including HTTPS, FTP, and TCP/IP Sockets. However, upon
examination, you will find that most B2Bi transactions still deliver XML documents over an
HTTPS pipe.
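What a "common data exchange format" means in practice can be shown with a small sketch
that reads an XML purchase order using the XML parser shipped with Java. The document layout
(purchaseOrder, buyer, item, sku, quantity) is a made-up example, not a RosettaNet or EDI
schema, and the security envelope and HTTPS transport described above are left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class PurchaseOrderReader {
    // Hypothetical document layout that both trading partners have agreed on.
    static final String SAMPLE =
        "<purchaseOrder id=\"PO-1001\"><buyer>Corporation A</buyer>"
        + "<item sku=\"PAPER-A4\" quantity=\"20\"/></purchaseOrder>";

    // Returns {order id, buyer, quantity} extracted from the XML document.
    static String[] read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            String id = doc.getDocumentElement().getAttribute("id");
            String buyer = doc.getElementsByTagName("buyer").item(0).getTextContent();
            Element item = (Element) doc.getElementsByTagName("item").item(0);
            return new String[] {id, buyer, item.getAttribute("quantity")};
        } catch (Exception e) {
            // Parsing problems are reported as an unchecked exception to keep the sketch short.
            throw new IllegalArgumentException("not a valid purchase order", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(" / ", read(SAMPLE)));
    }
}
```

Because both partners only need to agree on the document, neither has to expose its
application-native data structures to the other.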
B2B Process Integration
Even with industry-wide initiatives such as RosettaNet, a point-to-point data exchange
that manages static interactions has some limitations. If Corporation A wants to purchase
office supplies from Depot X, it must agree ahead of time on the content of the documents
exchanged and buying process. This is, of course, to be expected. However, what if the
situation involves managing multiple suppliers or if the interactions become more complex?
For instance, a scenario in which suppliers openly bid to compete on pricing will increase the
dimensions of process interactions. In that case, managing the B2B transaction is no longer
an activity of managing a single point-to-point interaction. Instead, it becomes a challenge of
managing business processes that are dynamic rather than static.
The B2B Process Integration pattern takes the limitations raised by the Data Exchange
pattern and addresses them by providing Business Process Integration (BPI) services. Just as
the Data Exchange pattern allows participants to manage data exchanges dynamically
through XML-based documents, the B2B Process Integration pattern allows the participants to
manage processes in the same way.
Therefore, richer, more complex relationships can occur between trading partners. The B2B
Process Integration pattern can be implemented as one of two variations: Closed Process B2Bi
or Open Process B2Bi. You might argue that each of these variations constitutes an individual
pattern, but because they share the common attribute of being process focused, I have
decided to treat them as variations to the B2B Process Integration pattern.
Approaches to Middleware
Middleware is computer software that connects software components or some people and
their applications. The software consists of a set of services that allows multiple processes
running on one or more machines to interact. This technology evolved to provide for
interoperability in support of the move to coherent distributed architectures,
which are most often used to support and simplify complex distributed applications. It
includes web servers, application servers, and similar tools that support application
development and delivery. Middleware is especially integral to modern information
technology based on XML, SOAP, Web services, and service-oriented architecture.
Middleware sits "in the middle" between application software that may be working on
different operating systems. It is similar to the middle layer of a three-tier single system
architecture, except that it is stretched across multiple systems or applications. Examples
include EAI software, telecommunications software, transaction monitors, and messaging-
and-queueing software.
The distinction between operating system and middleware functionality is, to some extent,
arbitrary. While core kernel functionality can only be provided by the operating system itself,
some functionality previously provided by separately sold middleware is now integrated in
operating systems. A typical example is the TCP/IP stack for telecommunications,
nowadays included in virtually every operating system.
In simulation technology, middleware is generally used in the context of the high level
architecture (HLA) that applies to many distributed simulations. It is a layer of software that
lies between the application code and the run-time infrastructure. Middleware generally
consists of a library of functions, and enables a number of applications (simulations or
federates in HLA terminology) to page these functions from the common library rather than
re-create them for each application.
Definition of Middleware
Software that provides a link between separate software applications. Middleware is
sometimes called plumbing because it connects two applications and passes data between
them. Middleware allows data contained in one database to be accessed through another.
This definition would fit enterprise application integration and data integration software.
ObjectWeb defines middleware as: "The software layer that lies between the operating
system and applications on each side of a distributed computing system in a network."
Origin of Middleware
Middleware is a relatively new addition to the computing landscape. It gained popularity in
the 1980s as a solution to the problem of how to link newer applications to older legacy
systems, although the term had been in use since 1968. It also facilitated distributed
processing, the connection of multiple applications to create a larger application, usually over
a network.
Use of middleware
Compared with the operating system and network services, middleware services provide a
more functional set of application programming interfaces that allow an application to:
• Locate transparently across the network, thus providing interaction with another
service or application
• Filter data to make it usable or public, for example via an anonymization process for
privacy protection
• Be independent from network services
• Be reliable and always available
• Add complementary attributes like semantics
Middleware offers some unique technological advantages for business and industry. For
example, traditional database systems are usually deployed in closed environments where
users access the system only via a restricted network or intranet (e.g., an enterprise's
internal network). With the phenomenal growth of the World Wide Web, users can access
virtually any database for which they have proper access rights from anywhere in the
world. Middleware addresses the problem of varying levels of interoperability among
different database structures. Middleware facilitates transparent access to legacy database
management systems (DBMSs) or applications via a web server without regard to database-
specific characteristics.
Businesses frequently use middleware applications to link information from departmental
databases, such as payroll, sales, and accounting, or databases housed in multiple
geographic locations. In the highly competitive healthcare community, laboratories make
extensive use of middleware applications for data mining, laboratory information system
(LIS) backup, and to combine systems during hospital mergers. Middleware helps bridge the
gap between separate LISs in a newly formed healthcare network following a hospital buyout.
Wireless networking developers can use middleware to meet the challenges associated with
wireless sensor network (WSN) technologies. Implementing a middleware application
allows WSN developers to integrate operating systems and hardware with the wide variety of
applications that are currently available.
Middleware can help software developers avoid having to write application programming
interfaces (API) for every control program, by serving as an independent programming
interface for their applications. For Future Internet network operation through traffic
monitoring in multi-domain scenarios, mediator tools (middleware) are a powerful help
since they allow operators, researchers and service providers to supervise quality of service
and analyse eventual failures in telecommunication services.
Finally, e-commerce uses middleware to assist in handling rapid and secure transactions over
many different types of computer environments. In short, middleware has become a critical
element across a broad range of industries, thanks to its ability to bring together resources
across dissimilar networks or computing platforms.
Types of middleware
Hurwitz's classification system organizes the many types of middleware that are
currently available. These classifications are based on scalability and recoverability:
• Remote Procedure Call: Client makes calls to procedures running on remote
systems. Can be asynchronous or synchronous.
• Message Oriented Middleware: Messages sent to the client are collected and
stored until they are acted upon, while the client continues with other processing.
• Object Request Broker: This type of middleware makes it possible for applications
to send objects and request services in an object-oriented system.
• SQL-oriented Data Access: Middleware between applications and database servers.
• Embedded Middleware: Communication services and integration interface
software/firmware that operates between embedded applications and the real time
operating system.
Other sources include these additional classifications:
• Transaction processing monitors: Provides tools and an environment to
develop and deploy distributed applications.
• Application servers: Software installed on a computer to facilitate the serving
(running) of other applications.
• Enterprise Service Bus: An abstraction layer on top of an Enterprise Messaging
System.
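The store-and-forward behaviour listed above for Message Oriented Middleware can be sketched
with a plain in-process queue. This is an assumption-laden toy, not a real messaging product:
java.util.concurrent.LinkedBlockingQueue plays the role of the message store, and the names
MessageQueueSketch, send and receive are invented for the illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MessageQueueSketch {
    // The queue plays the role of the middleware's message store.
    static final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // The sender hands the message over and returns immediately;
    // it does not wait for anyone to process it.
    static void send(String message) {
        queue.add(message);
    }

    // The receiver takes the oldest stored message, or null if none is waiting.
    static String receive() {
        return queue.poll();
    }

    public static void main(String[] args) {
        send("invoice-1");
        send("invoice-2");
        System.out.println(receive()); // messages come out in the order they were sent
        System.out.println(receive());
    }
}
```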
RPC
In computer science, a remote procedure call (RPC) is an inter-process communication
that allows a computer program to cause a subroutine or procedure to execute in another
address space (commonly on another computer on a shared network) without the
programmer explicitly coding the details for this remote interaction. That is, the programmer
writes essentially the same code whether the subroutine is local to the executing program, or
remote. When the software in question uses object-oriented principles, RPC is called remote
invocation or remote method invocation. Note that there are many different (often
incompatible) technologies commonly used to accomplish this.
History and origins
The idea of RPC (Remote Procedure Call) goes back at least as far as 1976, when it was
described in RFC 707. One of the first business uses of RPC was by Xerox under the name
"Courier" in 1981. The first popular implementation of RPC on Unix was Sun's RPC (now
called ONC RPC), used as the basis for NFS (Sun). Another early Unix implementation
was Apollo Computer's Network Computing System (NCS). NCS later was used as the
foundation of DCE/RPC in the OSF's Distributed Computing Environment (DCE). A decade later
Microsoft adopted DCE/RPC as the basis of the Microsoft RPC (MSRPC) mechanism, and
implemented DCOM on top of it. Around the same time (mid 90's), Xerox PARC's ILU, and
the Object Management Group's CORBA, offered another RPC paradigm based on
distributed objects with an inheritance mechanism.
Message passing
An RPC is initiated by the client, which sends a request message to a known remote server
to execute a specified procedure with supplied parameters. The remote server sends a
response to the client, and the application continues its process. There are many variations
and subtleties in various implementations, resulting in a variety of different (incompatible)
RPC protocols. While the server is processing the call, the client is blocked (it waits until the
server has finished processing before resuming execution). An important difference between
remote procedure calls and local calls is that remote calls can fail because of unpredictable
network problems. Also, callers generally must deal with such failures without knowing
whether the remote procedure was actually invoked. Idempotent procedures (those that
have no additional effects if called more than once) are easily handled, but enough
difficulties remain that code to call remote procedures is often confined to carefully written
low-level subsystems.
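Why idempotent procedures are "easily handled" can be seen in a short sketch: if repeating a
call has no additional effect, the caller can simply try again after a failure, even though it
cannot tell whether the failed attempt reached the server. The helper below is hypothetical
(RetrySketch and callWithRetry are names chosen for this example) and models the remote call
as a java.util.function.Supplier that throws on failure.

```java
import java.util.function.Supplier;

class RetrySketch {
    // Calls an idempotent remote operation, retrying when an attempt fails.
    // Safe only because repeating an idempotent call has no additional effect.
    static <T> T callWithRetry(Supplier<T> remoteCall, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return remoteCall.get();
            } catch (RuntimeException e) {
                // The call may or may not have run on the server; the caller cannot know.
                last = e;
            }
        }
        throw new IllegalStateException("remote call failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("simulated network timeout");
            }
            return "balance=100";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

For a non-idempotent procedure (for example "debit the account") blind retries like this would
be unsafe, which is one reason such code tends to live in carefully written subsystems.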
The steps in making an RPC
1. The client calls the client stub. The call is a local procedure call, with parameters
pushed on to the stack in the normal way.
2. The client stub packs the parameters into a message and makes a system call
to send the message. Packing the parameters is called marshaling.
3. The kernel sends the message from the client machine to the server machine.
4. The kernel passes the incoming packets to the server stub.
5. Finally, the server stub calls the server procedure. The reply traces the same steps in the
other direction.
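The marshaling in step 2 and the dispatch in step 5 can be sketched without any network at
all. In the illustration below the "message" is just a byte array, the kernel transport of
steps 3 and 4 is skipped, and the wire layout (procedure name followed by two integers) as well
as the names StubSketch, marshal and dispatch are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class StubSketch {
    // The procedure that lives on the "server".
    static int add(int a, int b) {
        return a + b;
    }

    // Client stub: packs the procedure name and parameters into a message (marshaling).
    static byte[] marshal(String procedure, int a, int b) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(procedure);
            out.writeInt(a);
            out.writeInt(b);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Server stub: unpacks the message and calls the server procedure.
    static int dispatch(byte[] message) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(message));
            String procedure = in.readUTF();
            int a = in.readInt();
            int b = in.readInt();
            if (!"add".equals(procedure)) {
                throw new IllegalArgumentException("unknown procedure: " + procedure);
            }
            return add(a, b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] request = marshal("add", 2, 3);  // step 2 on the client
        System.out.println(dispatch(request));  // step 5 on the server
    }
}
```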
Standard contact mechanisms
To let different clients access servers, a number of standardized RPC systems have been
created. Most of these use an interface description language (IDL) to let various platforms
call the RPC. The IDL files can then be used to generate code to interface between the
client and server. The most common tool used for this is RPCGEN.
Other RPC analogues
RPC analogues found elsewhere:
• Java's Java Remote Method Invocation (Java RMI) API provides similar functionality
to standard UNIX RPC methods.
• Modula-3's Network Objects, which were the basis for Java's RMI.
• XML-RPC is an RPC protocol that uses XML to encode its calls and HTTP as a
transport mechanism.
• Microsoft .NET Remoting offers RPC facilities for distributed systems implemented
on the Windows platform.
• RPyC implements RPC mechanisms in Python, with support for asynchronous calls.
• Pyro is an object-oriented form of RPC for Python.
• Etch (protocol) is a framework for building network services.
• Facebook's Thrift protocol and framework.
• CORBA provides remote procedure invocation through an intermediate layer called
the "Object Request Broker".
• DRb allows Ruby programs to communicate with each other on the same machine
or over a network. DRb uses remote method invocation (RMI) to pass commands and
data between processes.
• AMF allows Flex applications to communicate with back-ends or other applications
that support AMF.
• Libevent provides a framework for creating RPC servers and clients.
• Windows Communication Foundation is an application programming interface in the
.NET Framework for building connected, service-oriented applications.
RMI
The Java Remote Method Invocation Application Programming Interface (API), or Java RMI,
is a Java application programming interface that performs the object-oriented equivalent of
remote procedure calls (RPC).
1. The original implementation depends on Java Virtual Machine (JVM) class
representation mechanisms and it thus only supports making calls from one JVM
to another. The protocol underlying this Java-only implementation is known as Java
Remote Method Protocol (JRMP).
2. In order to support code running in a non-JVM context, a CORBA version was later
developed.
Usage of the term RMI may denote solely the programming interface or may signify both the
API and JRMP, whereas the term RMI-IIOP (read: RMI over IIOP) denotes the RMI interface
delegating most of the functionality to the supporting CORBA implementation.
The programmers of the original RMI API generalized the code somewhat to support
different implementations, such as an HTTP transport. Additionally, the ability to pass
arguments "by value" was added to CORBA in order to support the RMI interface. Still, the
RMI-IIOP and JRMP implementations do not have fully identical interfaces.
RMI functionality comes in the package java.rmi, while most of Sun's implementation is
located in the sun.rmi package. Note that with Java versions before Java 5.0 developers had
to compile RMI stubs in a separate compilation step using rmic. Version 5.0 of Java and beyond
no longer require this step.
Jini offers a more advanced version of RMI in Java. It functions similarly but provides more
advanced searching capabilities and mechanisms for distributed object applications.
Example
The following classes implement a simple client-server program using RMI that displays a
message.
RmiServer class: Listens to RMI requests and implements the interface
which is used by the client to invoke remote methods.
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.RMISecurityManager;
import java.rmi.server.UnicastRemoteObject;
import java.rmi.registry.*;

public class RmiServer extends UnicastRemoteObject implements RmiServerIntf
{
    static public final String MESSAGE = "Hello world";

    public RmiServer() throws RemoteException
    {
        super();
    }

    public String getMessage() {
        return MESSAGE;
    }

    public static void main(String args[])
    {
        System.out.println("RMI server started");

        // Create and install a security manager
        // (RMISecurityManager is deprecated in later Java versions)
        if (System.getSecurityManager() == null)
        {
            System.setSecurityManager(new RMISecurityManager());
            System.out.println("Security manager installed.");
        } else
            System.out.println("Security manager already exists.");

        try { // special exception handler for registry creation
            LocateRegistry.createRegistry(1099);
            System.out.println("java RMI registry created.");
        }
        catch (RemoteException e) {
            // do nothing, error means registry already exists
            System.out.println("java RMI registry already exists.");
        }

        try {
            // Instantiate RmiServer
            RmiServer obj = new RmiServer();

            // Bind this object instance to the name "RmiServer"
            Naming.rebind("//localhost/RmiServer", obj);

            System.out.println("PeerServer bound in registry");
        }
        catch (Exception e) {
            System.err.println("RMI server exception:");
            e.printStackTrace();
        }
    }
}
RmiServerIntf class: Defines the interface that is used by the client and
implemented by the server.
import java.rmi.Remote;
import java.rmi.RemoteException;

public interface RmiServerIntf extends Remote {
    public String getMessage() throws RemoteException;
}
RmiClient class: This is the client which gets the reference to the remote object
and invokes its method to get a message.
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.RMISecurityManager;

public class RmiClient
{
    // "obj" is the reference of the remote object
    RmiServerIntf obj = null;

    public String getMessage() {
        try {
            obj = (RmiServerIntf) Naming.lookup("//localhost/RmiServer");
            return obj.getMessage();
        }
        catch (Exception e) {
            System.err.println("RmiClient exception: " + e.getMessage());
            e.printStackTrace();
            return e.getMessage();
        }
    }

    public static void main(String args[])
    {
        // Create and install a security manager
        if (System.getSecurityManager() == null)
        {
            System.setSecurityManager(new RMISecurityManager());
        }
        RmiClient cli = new RmiClient();
        System.out.println(cli.getMessage());
    }
}
Before running this example on Java versions earlier than 5.0, we need to make the "Stub" file
of the interface we used. For this task we have the RMI compiler "rmic".
• Note: we make the stub file from the compiled *.class of the class that implements the
remote interface, not from "*.java".
rmic RmiServer
server.policy: This file is required on the server to allow TCP/IP communication for
the remote registry and for the RMI server.
grant {
    permission java.net.SocketPermission "127.0.0.1:*", "connect,resolve";
    permission java.net.SocketPermission "127.0.0.1:*", "accept";
};
The server.policy file should be passed using the -D switch of the Java runtime, e.g.:
java.exe -Djava.security.policy=server.policy RmiServer
client.policy: This file is required on the client to connect to the RMI server using TCP/IP.
grant {
    permission java.net.SocketPermission "127.0.0.1:*", "connect,resolve";
};
no.policy: Also, if you have trouble connecting, try this file for the server or
client. It grants all permissions, so it is suitable for testing only.
grant {
    permission java.security.AllPermission;
};
Enterprise Application Integration
Enterprise Application Integration (EAI) is defined as the use of software and computer
systems architectural principles to integrate a set of enterprise computer applications.
Enterprise Application Integration (EAI) is an integration framework composed of a
collection of technologies and services which form a middleware to enable integration of
systems and applications across the enterprise. Supply chain management applications (for
managing inventory and shipping), customer relationship management applications (for
managing current and potential customers), business intelligence applications (for finding
patterns from existing data from operations), and other types of applications (for managing
data such as human resources data, health care, internal communications, etc.) typically
cannot communicate with one another in order to share data or business rules.
For this reason, such applications are sometimes referred to as islands of automation or
information silos. This lack of communication leads to inefficiencies, wherein identical data
are stored in multiple locations, or straightforward processes are unable to be automated.
Enterprise application integration (EAI) is the process of linking such applications within a
single organization together in order to simplify and automate business processes to the
greatest extent possible, while at the same time avoiding having to make sweeping
changes to the existing applications or data structures. In the words of the Gartner Group,
EAI is the "unrestricted sharing of data and business processes among any connected
application or data sources in the enterprise."
One large challenge of EAI is that the various systems that need to be linked together
often reside on different operating systems, use different database solutions and different
computer languages, and in some cases are legacy systems that are no longer supported by
the vendor who originally created them. In some cases, such systems are dubbed "stovepipe
systems" because they consist of components that have been jammed together in a way
that makes it very hard to modify them in any way.
Purposes of EAI
EAI can be used for different purposes:
• Data (information) Integration: Ensuring that information in multiple systems is
kept consistent. This is also known as EII (Enterprise Information Integration).
• Vendor independence: Extracting business policies or rules from applications and
implementing them in the EAI system, so that even if one of the business
applications is replaced with a different vendor's application, the business rules do
not have to be re-implemented.
• Common Facade: An EAI system could front-end a cluster of applications,
providing a single consistent access interface to these applications and shielding
users from having to learn to interact with different software packages.
EAI patterns
Integration patterns
There are two patterns that EAI systems implement:
• Mediation: Here, the EAI system acts as the go-between or broker
between multiple communicating applications. Whenever an
interesting event occurs in an application (e.g., new information created, new
transaction completed, etc.), an integration module in the EAI system is notified.
The module then propagates the changes to other relevant applications.
• Federation: In this case, the EAI system acts as the overarching facade across
multiple applications. All event calls from the "outside world" to any of the
applications are front-ended by the EAI system. The EAI system is configured to
expose only the relevant information and interfaces of the underlying
applications to the outside world, and performs all interactions with the
underlying applications on behalf of the requester.
Both patterns are often used concurrently. The same EAI system could be keeping multiple
applications in sync (mediation), while servicing requests from external users against these
applications (federation).
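As a rough illustration of the two patterns, the sketch below models a toy in-memory hub. It is not taken from any EAI product: the class name EAIBroker, the event name "customer.created", and the crm and billing stores are all hypothetical. Mediation is shown as publish/subscribe, federation as a single facade call.

```python
from collections import defaultdict


class EAIBroker:
    """Toy EAI hub (hypothetical): mediation via pub/sub, federation via a facade."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # event type -> integration modules
        self._services = {}                    # exposed name -> callable

    # Mediation: an application reports an event, the hub propagates it.
    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)

    # Federation: outside callers see only what the hub chooses to expose.
    def expose(self, name, func):
        self._services[name] = func

    def request(self, name, *args):
        return self._services[name](*args)


crm = {}      # hypothetical CRM data store
billing = {}  # hypothetical billing data store

broker = EAIBroker()
broker.subscribe("customer.created", lambda c: crm.update({c["id"]: c["name"]}))
broker.subscribe("customer.created", lambda c: billing.update({c["id"]: 0.0}))
broker.expose("get_customer_name", lambda customer_id: crm[customer_id])

# One event keeps both applications in sync (mediation) ...
broker.publish("customer.created", {"id": 7, "name": "Acme"})
# ... and an external caller goes through the facade (federation).
name = broker.request("get_customer_name", 7)
```

In this sketch neither application knows about the other; each only registers with the hub, which is the property both patterns rely on.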
Access patterns
EAI supports both asynchronous and synchronous access patterns, the former being typical
in the mediation case and the latter in the federation case.
Lifetime patterns
An integration operation could be short-lived (e.g., keeping data in sync across two
applications could be completed within a second) or long-lived (e.g., one of the steps could
involve the EAI system interacting with a human workflow application for approval of a loan
that takes hours or days to complete).
EAI topologies
There are two major topologies: hub-and-spoke, and bus. Each has its own advantages and
disadvantages. In the hub-and-spoke model, the EAI system is at the center (the hub), and
interacts with the applications via the spokes. In the bus model, the EAI system is the bus (or
is implemented as a resident module in an already existing message bus or message-oriented
middleware).
Technologies
Multiple technologies are used in implementing each of the components of the EAI system:
Bus/hub
This is usually implemented by enhancing standard middleware products (application
server, message bus) or implemented as a stand-alone program (i.e., one that does not
use any middleware), acting as its own middleware.
Application connectivity
The bus/hub connects to applications through a set of adapters (also referred to as
connectors). These are programs that know how to interact with an underlying business
application. The adapter performs two-way communication, performing requests from
the hub against the application, and notifying the hub when an event of interest
occurs in the application (a new record inserted, a transaction completed, etc.).
Adapters can be specific to an application (e.g., built against the application vendor's
client libraries) or specific to a class of applications (e.g., can interact with any
application through a standard communication protocol, such as SOAP or SMTP). The
adapter could reside in the same process space as the bus/hub or execute in a remote
location and interact with the hub/bus through industry standard protocols such as
message queues, web services, or even use a proprietary protocol. In the Java world,
standards such as JCA allow adapters to be created in a vendor-neutral manner.
Data format and transformation
To avoid every adapter having to convert data to/from every other application's format,
EAI systems usually stipulate an application-independent (or common) data format.
The EAI system usually provides a data transformation service as well to help
convert between application-specific and common formats. This is done in two steps:
the adapter converts information from the application's format to the bus's common
format. Then, semantic transformations are applied on this (converting zip
codes to city names, splitting/merging objects from one application into objects in the
other applications, and so on).
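The two-step conversion can be sketched as follows. This is only an illustration under stated assumptions: the field names (CustID, PostalCode, and so on), the common format, and the ZIP_TO_CITY lookup table are invented for the example and do not come from any particular EAI system.

```python
# Hypothetical lookup table used by the semantic step.
ZIP_TO_CITY = {"10001": "New York", "94105": "San Francisco"}


def crm_adapter_to_common(record):
    """Step 1 (adapter): map an application-specific record to the common format."""
    return {
        "customer_id": record["CustID"],
        "full_name": f'{record["First"]} {record["Last"]}',
        "zip": record["PostalCode"],
    }


def semantic_transform(common):
    """Step 2 (EAI system): enrich the common record, here zip code -> city name."""
    enriched = dict(common)
    enriched["city"] = ZIP_TO_CITY.get(common["zip"], "unknown")
    return enriched


crm_record = {"CustID": 42, "First": "Ada", "Last": "Lovelace", "PostalCode": "10001"}
common_record = semantic_transform(crm_adapter_to_common(crm_record))
```

With a common format in the middle, each adapter needs only one converter to and from that format rather than one per partner application.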
Integration modules
An EAI system could be participating in multiple concurrent integration operations at
any given time, each type of integration being processed by a different integration
module. Integration modules subscribe to
events of specific types and process notifications that they receive when these events
occur. These modules could be implemented in different ways: on Java-based EAI
systems, these could be web applications or EJBs or even POJOs that conform to the EAI
system's specifications.
Support for transactions
When used for process integration, the EAI system also provides transactional
consistency across applications by executing all integration operations across all
applications in a single overarching distributed transaction (using two-phase commit
protocols or compensating transactions).
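Of the two mechanisms named above, compensating transactions are the easier to sketch in a few lines. The example below is a simplified illustration, not a full transaction manager: the helper run_with_compensation and the inventory and ledger stores are hypothetical, and failures of the compensations themselves are not handled.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        # Undo in reverse order, touching only steps that actually finished.
        for compensate in reversed(completed):
            compensate()
        return False
    return True


inventory = {"widget": 5}  # hypothetical inventory application
ledger = []                # hypothetical billing application


def reserve_stock():
    inventory["widget"] -= 1


def release_stock():
    inventory["widget"] += 1


def charge_customer():
    raise RuntimeError("payment system unavailable")


def refund_customer():
    ledger.append("refund")


ok = run_with_compensation([
    (reserve_stock, release_stock),
    (charge_customer, refund_customer),
])
```

Here the charge fails, so the stock reservation is released again and no refund is issued, leaving both applications as they were before the operation started.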
Communication architectures
Currently, there are many variations of thought on what constitutes the best infrastructure,
component model, and standards structure for Enterprise Application Integration. There
seems to be consensus that four components are essential for a modern enterprise application
integration architecture:
1. A centralized broker that handles security, access, and communication. This can be
accomplished through integration servers (like the School Interoperability Framework
(SIF) Zone Integration Servers) or through similar software like the Enterprise service
bus (ESB) model that acts as a SOAP-oriented services manager.
2. An independent data model based on a standard data structure, also known as a
canonical data model. It appears that XML and the use of XML style sheets have
become the de facto and in some cases de jure standard for this uniform business
language.
3. A connector or agent model where each vendor, application, or interface can build a
single component that can speak natively to that application and communicate with the
centralized broker.
4. A system model that defines the APIs, data flow and rules of engagement to
the system such that components can be built to interface with it in a standardized
way.
Although other approaches like connecting at the database or user-interface level have been
explored, they have not been found to scale or be able to adjust. Individual applications can
publish messages to the centralized broker and subscribe to receive certain messages from
that broker. Each application only requires one connection to the broker. This central control
approach can be extremely scalable and highly evolvable. Enterprise Application Integration
is related to middleware technologies such as message-oriented middleware (MOM), and
data representation technologies such as XML. Other EAI technologies involve using web
services as part of service-oriented architecture as a means of integration. Enterprise
Application Integration tends to be data centric. In the near future, it will come to include
content integration and business processes.
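The claim that each application needs only one connection to the broker can be made concrete with a little arithmetic. The sketch below compares that with the worst case of direct integration, assuming every application must exchange data with every other one; the function names are ours.

```python
def point_to_point_links(n):
    """Links needed if each of n applications connects directly to every other."""
    return n * (n - 1) // 2


def broker_links(n):
    """Links needed if each of n applications connects only to the broker."""
    return n


for n in (3, 10, 50):
    print(n, point_to_point_links(n), broker_links(n))
```

Direct links grow quadratically with the number of applications, while broker links grow linearly, which is one reason the central approach is described as scalable.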
EAI Implementation Pitfalls
In 2003 it was reported that 70% of all EAI projects fail.
Most of these failures are not due to the software itself or technical difficulties, but due to
management issues. Integration Consortium European Chairman Steve Craggs has
outlined the seven main pitfalls encountered by companies using EAI systems and explains
solutions to these problems.
• Constant change
The very nature of EAI is dynamic and requires dynamic project managers to manage
its implementation.
• Shortage of EAI experts
EAI requires knowledge of many issues and technical aspects.
• Competing standards
Within the EAI field, the paradox is that EAI standards themselves are not universal.
• EAI is a tool paradigm
EAI is not a tool, but rather a system, and should be implemented as such.
• Building interfaces is an art
Engineering the solution is not sufficient. Solutions need to be negotiated with user
departments to reach a common consensus on the final outcome. A lack of consensus
on interface designs leads to excessive effort to map between the data requirements of
the various systems.
• Loss of detail
Information that seemed unimportant at an earlier stage may become crucial later.
• Accountability
Since so many departments have many conflicting requirements, there should be clear
accountability for the system's final structure.
Other potential problems may arise in these areas:
• Emerging requirements
EAI implementations should be extensible and modular to allow for future changes.
• Protectionism
The applications whose data is being integrated often belong to different departments
that have technical, cultural, and political reasons for not wanting to share their data
with other departments.
Advantages and Disadvantages
• Advantages
o Real-time information access among systems
o Streamlines business processes and helps raise organizational efficiency
o Maintains information integrity across multiple systems
o Ease of development and maintenance
• Disadvantages
o High initial development costs, especially for small and mid-sized businesses
(SMBs)
o Requires a fair amount of up-front business design, which many managers are not
able to envision or not willing to invest in. Most EAI projects usually start off
as point-to-point efforts, very soon becoming unmanageable as the number of
applications increases
Software as a Service (SaaS)
Software as a Service is a relatively new trend among software providers in which an
application is designed to be accessed on the Web rather than installed on the user's
workstation. Though the software is Web-based, it is not browser-based. Instead, the
software provider develops a thin client application that the user can download one time to
their workstation. Once the application is installed, the user's workstation maintains
constant communication with the software provider's server whenever the application is in
use. The application is called a thin client because it provides a considerably smaller
download than locally housed applications, or fat clients, while maintaining the same robust
functionality. The hosted server houses all of the user's data; none is stored on the user's
computer. There are many benefits to using hosted software.
SOA
Service-oriented architecture (SOA) is a hot topic in enterprise computing because many IT
professionals see the potential of an SOA, especially a web services-based SOA, in
dramatically speeding up the application development process. They also see it as a way
to build applications and systems that are more adaptable, and in doing so, they see IT
becoming more agile in responding to changing business needs. Not only is SOA a hot topic,
but it's clearly the wave of the future. Gartner reports that "By 2008, SOA will be a
prevailing software engineering practice, ending the 40-year domination of monolithic
software architecture" and that "Through 2008, SOA and web services will be implemented
together in more than 75 percent of new SOA or web services projects."
But despite this strong trend, some in the IT community don't feel that the web services
underpinning for an SOA is mature enough for their enterprise to consider migration to a
service-oriented architecture. For others, the terms service-oriented architecture and web
services draw a blank stare. An earlier article presented a brief overview of SOA and the role
of web services in realizing it. This article supplements that earlier article. If you're not familiar
with SOA and web services, this article aims to familiarize you with them. It defines some of
the key terms and concepts related to SOA and web services. A critical mass of widely
adopted technologies is available now to implement and use a web services-based SOA, and
more technologies, as well as tools, are on the way. Figure 1 identifies some of these
technologies and tools. Each layer of the figure shows technologies or tools that leverage
technologies in the surrounding layers.
A service-oriented architecture is an information technology approach or strategy in which
applications make use of (perhaps more accurately, rely on) services available in a network
such as the World Wide Web. Implementing a service-oriented architecture can involve
developing applications that use services, making applications available as services so that
other applications can use those services, or both. A service provides a specific
function, typically a business function, such as analyzing an individual's credit history or
processing a purchase order. A service can provide a single discrete function, such as
converting one type of currency into another, or it can perform a set of related business
functions, such as handling the various operations in an airline reservations system. Services
that perform a related set of business functions, as opposed to a single function, are said to be
"coarse grained." Multiple services can be used together in a coordinated way. The
aggregated, or composite, service can be used to satisfy a more complex business
requirement. In fact, one way of looking at an SOA is as an approach to connecting
applications (exposed as services) so that they can communicate with (and take advantage
of) each other. In other words, a service-oriented architecture is a way of sharing functions
(typically business functions) in a widespread and flexible way.
The concept of an SOA is not new. Service-oriented architectures have been used for years.
What distinguishes an SOA from other architectures is loose coupling. Loose coupling
means that the client of a service is essentially independent of the service. The way a
client (which can be another service) communicates with the service doesn't depend on the
implementation of the service. Significantly, this means that the client doesn't have to know
very much about the service to use it. For instance, the client doesn't need to know what
language the service is coded in or what platform the service runs on. The client
communicates with the service according to a specified, well-defined interface, and then
leaves it up to the service implementation to perform the necessary processing. If the
implementation of the service changes (for instance, the airline reservations application
is revised), the client communicates with it in the same way as before, provided that the
interface remains the same. Loose coupling enables services to be document-oriented (or
document-centric). A document-oriented service accepts a document as input, as opposed to
something more granular like a numeric value or Java object. The client does not know or
care what business function in the service will process the document. It's up to the
service to determine what business function (or functions) to apply based on the content of
the document. However, what is relatively new is the emergence of web services-based SOAs.
A web service is a service that communicates with clients through a set of standard protocols
and technologies. These web services standards are implemented in platforms and products
from all the major software vendors, making it possible for clients and services to
communicate in a consistent way across a wide spectrum of platforms and operating
environments. This universality has made web services the most prevalent approach to
implementing an SOA.
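Loose coupling can be illustrated in miniature with an interface that hides two different implementations. The sketch below is in-process Python rather than a networked web service, and every name in it (CurrencyConverter, TableConverter, InverseConverter, the made-up exchange rates) is hypothetical; it is only meant to show that the client depends on the interface alone.

```python
from typing import Protocol


class CurrencyConverter(Protocol):
    """The published interface: all a client needs to know."""

    def convert(self, amount: float, source: str, target: str) -> float: ...


class TableConverter:
    """One implementation: multiplies by a stored forward rate (made-up value)."""

    RATES = {("USD", "EUR"): 0.5}

    def convert(self, amount, source, target):
        return amount * self.RATES[(source, target)]


class InverseConverter:
    """A revised implementation: stores the reverse rate and divides instead."""

    REVERSE_RATES = {("EUR", "USD"): 2.0}

    def convert(self, amount, source, target):
        return amount / self.REVERSE_RATES[(target, source)]


def price_in_eur(service: CurrencyConverter, usd: float) -> float:
    # The client is written against the interface, not an implementation.
    return service.convert(usd, "USD", "EUR")


before = price_in_eur(TableConverter(), 10.0)
after = price_in_eur(InverseConverter(), 10.0)
```

Swapping the implementation leaves the client code untouched, which mirrors the airline reservations example in the text: the call stays the same as long as the interface does.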
Optionally, an SOA can also include a service that provides a directory or registry of
services. The registry contains information about the service such as its interface. A client
can discover services by examining the registry. A registry can also be coupled with a
repository component that stores additional information about each service. This additional
"metadata" can include business process information such as policy statements.
Why SOA
There are many reasons for an enterprise to take an SOA approach, and more specifically, a
web services-based SOA approach. Some of the primary reasons are:
Reusability: What drives the move to SOA is reuse of business services. Developers within
an enterprise and across enterprises (particularly, in business partnerships) can take the
code developed for existing business applications, expose it as web services, and then reuse
it to meet new business requirements. Reusing functionality that already exists outside or
inside an enterprise instead of developing code that reproduces those functions can result in
huge savings in application development cost and time. The benefit of reuse grows
dramatically as more and more business services get built and incorporated into different
applications. A major obstacle in taking advantage of existing code is the uniqueness of
specific applications and systems. Typically, solutions developed in different enterprises,
even different departments within the same enterprise, have unique characteristics. They
run in different operating environments, they're coded in different languages, they use
different programming interfaces and protocols. You need to understand how and where
these applications and systems run to communicate with them. The work involved in doing
this analysis and the development effort in tying these pieces together can be very time
consuming. Witness the pain IT organizations generally encounter when they try to integrate
their applications with systems from business partners (or even with legacy systems from
other parts of their own company). In an SOA, the only characteristic of a service that a
requesting application needs to know about is the public interface. The functions of an
application or system (including legacy systems) can be dramatically easier to access as a
service in an SOA than in some other architecture. So integrating applications and systems can
be much simpler.
Interoperability: The SOA vision of interaction between clients and loosely-coupled
services means widespread interoperability. In other words, the objective is for clients and
services to communicate and understand each other no matter what platform they run on.
This objective can be met only if clients and services have a standard way of communicating
with each other, a way that's consistent across platforms, systems, and languages. In
fact, web services provide exactly that. Web services comprise a maturing set of protocols
and technologies that are widely
accepted and used, and that are platform, system, and language independent. In addition,
these protocols and technologies work across firewalls, making it easier for business partners
to share vital services. Promising to make things even more consistent is the WS-I basic
profile, introduced by the Web Services Interoperability Organization (an organization
chartered to promote web services interoperability). The WS-I basic profile identifies a core
set of web services technologies that, when implemented in different platforms and systems,
help ensure that services on these different platforms and systems, and written in different
languages, can communicate with each other. The WS-I basic profile has widespread
backing in the computer industry, virtually guaranteeing interoperability of services that
conform to the profile.
Scalability: Because services in an SOA are loosely coupled, applications that use these
services tend to scale easily, certainly more easily than applications in a more tightly-
coupled environment. That's because there are few dependencies between the requesting
application and the services it uses. The dependencies between client and service in a
tightly-coupled environment are compounded (and the development effort made
significantly more complex) as an application that uses these services scales up to handle
more users. Services in a web services-based SOA tend to be coarse-grained, document-
oriented, and asynchronous. As mentioned earlier, coarse-grained services offer a set of
related business functions rather than a single function. For example, a coarse-grained
service might handle the processing of a complete purchase order. By comparison, a fine-
grained service might handle only one operation in the purchase order process. Again, as
mentioned earlier, a document-oriented service accepts a document as input, as opposed to
something more granular like a numeric value or Java object. An example of a document-
oriented service might be a travel agency service that accepts as input a document that
contains travel information for a specific trip request. An asynchronous service performs its
processing without forcing the client to wait for the processing to finish. A synchronous
service forces the client to wait. The relatively limited interaction required for a client to
communicate with a coarse-grained, asynchronous service, especially a service that handles a
document such as a purchase order, allows applications that use these services to scale
without putting a heavy communication load on the network.
Flexibility: Loosely-coupled services are typically more flexible than more tightly-coupled
applications. In a tightly-coupled architecture, the different components of an application are
tightly bound to each other, sharing semantics, libraries, and often sharing state. This makes
it difficult to evolve the application to keep up with changing business requirements. The
loosely-coupled, document-based, asynchronous nature of services in an SOA allows
applications to be flexible and easy to evolve with changing requirements.
Cost Efficiency: Other approaches that integrate disparate business resources such as
legacy systems, business partner applications, and department-specific solutions are
expensive because they tend to tie these components together in a customized way.
Customized solutions are costly to build because they require extensive analysis,
development time, and effort. They're also costly to maintain and extend because they're
typically tightly-coupled, so that changes in one component of the integrated solution
require changes in other components. A standards-based approach such as a web services-
based SOA should result in less costly solutions because the integration of clients and
services doesn't require the in-depth analysis and unique code of customized solutions.
Also, because services in an SOA are loosely-coupled, applications that use these services
should be less costly to maintain and easier to extend than customized solutions. In
addition, a lot of the Web-based infrastructure for a web services-based SOA is already in
place in many enterprises, further limiting the cost. Last, but not least, SOA is about reuse of
business functions exposed as coarse-grained services. This is potentially the biggest cost
saving of all.
SOAP
Most people tend to think of SOAP as nothing more than a protocol for Remote Procedure Calls
(RPC) over Hypertext Transfer Protocol (HTTP). However, this is only one implementation of
SOAP, which is defined as a lightweight protocol for passing structured and typed data
between two peers using XML. The specification doesn't require the use of HTTP or even a
request/response type of conversation. Instead, SOAP can be used with any protocol that
supports the transmission of XML data from a sender to a receiver. In fact, both
Microsoft and IBM have implemented SOAP messages over SMTP, which means SOAP
messages can be routed through e-mail servers.
The bottom line is that SOAP is nothing more than a lightweight messaging protocol, which can
be used to send messages between peers. The main goals of SOAP are focused on providing a
common way to package message data and define encoding rules used to serialize and
deserialize the data during transmission. Another goal was to provide a model that can be
used to implement RPC operations using SOAP. All these goals are considered "orthogonal" in
the specification, which means they are independent, but related. For example, a SOAP
message should define encoding rules, but these rules needn't be the same rules defined in
the specification.
Example:
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Header>
    <t:Transaction xmlns:t="some-URI" SOAP-ENV:mustUnderstand="1">
    </t:Transaction>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <m:GetBookPrice xmlns:m="some-URI">
      <title>My Life and Times</title>
      <author>Felix Harrison</author>
    </m:GetBookPrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
A related standard, SOAP Messages With Attachments, specifies the format of a SOAP
message that includes attachments (such as images). SOAP messages (with or without
attachments) are independent of any operating system or platform and can be transported
using a variety of communication protocols, such as HTTP or SMTP. WS-I Basic Profile 1.0
references SOAP 1.1.
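Because a SOAP message is just an XML document with an Envelope, an optional Header, and a Body, it can be built and read with any XML library. The sketch below uses Python's standard xml.etree.ElementTree to produce and parse a request shaped like the GetBookPrice example; the payload namespace urn:example:books and the helper names are assumptions made for illustration, and no transport (HTTP or SMTP) is involved.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
BOOKS_NS = "urn:example:books"                          # hypothetical payload namespace


def build_request(title, author):
    """Package the message data inside Envelope/Body, as SOAP prescribes."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{BOOKS_NS}}}GetBookPrice")
    ET.SubElement(call, "title").text = title
    ET.SubElement(call, "author").text = author
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """What a receiver might do: locate the payload in the Body and unpack it."""
    root = ET.fromstring(xml_text)
    call = root.find(f"{{{SOAP_ENV}}}Body/{{{BOOKS_NS}}}GetBookPrice")
    return call.findtext("title"), call.findtext("author")


message = build_request("My Life and Times", "Felix Harrison")
parsed = read_request(message)
```

The same string could be handed to any transport that can carry XML, which is the point the text makes about SOAP not being tied to HTTP.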
WSDL
How does a client know what format to use in making a request to a service? For that matter,
how do the client and the service know what the request means? The answers to these
questions are provided by information in an XML
document, called a WSDL document, that contains a description of the web service's
interface. A WSDL document contains information specified in Web Service Description
Language (WSDL), as defined in the WSDL specification. WSDL defines an XML schema for
describing a web service. To uncover the description for a web service, a client needs to find
the service's WSDL document. One way, perhaps the most typical way, to do this is for the
client to find a pointer to the WSDL document in the web service's registration, which can
be in a UDDI registry or an ebXML registry/repository. A typical scenario is that a business
registers its service. The registry entry includes a pointer to a WSDL file that contains the
WSDL document for the service. Another business searches the registry and finds the
service. A programmer uses the interface information in the WSDL document to construct
the appropriate calls to the service. A WSDL document describes a web service as a collection
of abstract items called "ports" or "endpoints." A WSDL document also defines the actions
performed by a web service and the data transmitted to these actions in an abstract way.
Actions are represented by "operations," and data is represented by "messages." A
collection of related operations is known as a "port type." A port type constitutes the
collection of actions offered by a web service. What turns a WSDL description from abstract
to concrete is a "binding." A binding specifies the network protocol and message format
specifications for a particular port type. A port is defined by associating a network address
with a binding. If a client locates a WSDL document and finds the binding and network address
for each port, it can call the service's operations according to the specified protocol and
message format.
Here, for example, is a WSDL document for an online book search service. Notice in the
example that the WSDL document specifies one operation for the service, getBooks:
<operation name="getBooks" ...
The input message for the operation, BookSearchInput, contains a string value named isbn:
<complexType>
  <all>
    <element name="isbn" type="string"/>
  </all>
The output message for the operation, BookSearchOutput, returns a string value named title:
<complexType>
  <all>
    <element name="title" type="string"/>
  </all>
Notice too that the WSDL document specifies a SOAP protocol binding. The style attribute in
the binding element specifies that the receiver should interpret the payload in the SOAP
message as an RPC method call:
<soap:binding
    transport="http://schemas.xmlsoap.org/soap/http"
    style="rpc"/>
Alternatively, the style attribute in a binding element could specify "document". In that case,
a complete document (typically an XML document) would be exchanged in the call. In fact, the
document style of binding is more typical of services in an SOA. One of the advantages of
passing an XML document to a service instead of an RPC-style message is that it tends to be
easier to validate and to apply business rules. The WS-I Basic Profile 1.0 specifies the
constructs in WSDL 1.1 that can be used to ensure interoperability. It also clarifies some of
the construct descriptions in the WSDL 1.1 specification.
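To show how a client-side tool might use a WSDL document, the sketch below parses a cut-down WSDL 1.1 description and pulls out the operation names, the binding style, and the port address. The WSDL text itself is a hypothetical reconstruction in the spirit of the book search example (names such as BookSearchPortType, the urn:example:booksearch namespace, and the example.com address are invented); only the two WSDL namespaces are the standard ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# A cut-down, hypothetical WSDL 1.1 document for a book search service.
WSDL_TEXT = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:tns="urn:example:booksearch"
             targetNamespace="urn:example:booksearch">
  <portType name="BookSearchPortType">
    <operation name="getBooks">
      <input message="tns:BookSearchInput"/>
      <output message="tns:BookSearchOutput"/>
    </operation>
  </portType>
  <binding name="BookSearchBinding" type="tns:BookSearchPortType">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="rpc"/>
  </binding>
  <service name="BookSearchService">
    <port name="BookSearchPort" binding="tns:BookSearchBinding">
      <soap:address location="http://example.com/booksearch"/>
    </port>
  </service>
</definitions>
"""

root = ET.fromstring(WSDL_TEXT)

# Abstract part: which operations does the port type offer?
operations = [op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)]
# Concrete part: how (binding) and where (port address) can they be called?
style = root.find("wsdl:binding/soap:binding", NS).get("style")
address = root.find("wsdl:service/wsdl:port/soap:address", NS).get("location")
```

The three values correspond to the abstract-to-concrete progression described above: port type and operations first, then the binding, then the port's network address.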
UDDI and ebXML
As mentioned earlier, an SOA can also include a service that provides a directory or registry of
services. But how can a service be described in a registry so that a program looking for that
service can easily find it and understand what it
does? The Universal Description, Discovery, and Integration (UDDI) specifications define how
to publish and discover information about services in a UDDI-conforming registry. More
specifically, the specifications define a UDDI schema and a UDDI API. The UDDI schema
identifies the types of XML data structures that comprise an entry in the registry for a service.
Think of a UDDI registry as a "Yellow Pages" for web services. Like the Yellow Pages directory
for phone numbers, a UDDI registry provides information about a service such as the name of
the service, a brief description of what it does, an address where the service can be
accessed, and a description of the interface for accessing the service. The accompanying
figure illustrates the schema. It shows the five types of XML data structures that comprise a
registration.
Here is an example that shows part of a complete BusinessEntity structure for a
hypothetical company named BooksToGo. The company provides various web services,
including an online book ordering service.
<businessEntity businessKey="35AF7F00-1419-11D6-A0DC-000C0E00ACDD"
    authorizedName="0100002CAL"
    operator="www-3.ibm.com/services/uddi">
  <name>BooksToGo</name>
  <description xml:lang="en">
    The source for all professional books
  </description>
  <contacts>
    <contact>
      <personName>Benjamin Boss</personName>
      <phone>(877)1111111</phone>
    </contact>
  </contacts>
The API describes the SOAP messages that are used to publish an entry in a registry, or
discover an entry in a registry. Here, for example, is a message that searches for all business
entities in a registry whose name begins with the characters "Books":
<find_business generic="2.0" xmlns="urn:uddi-org:api_v2">
  <name>Books%</name>
</find_business>
UDDI 2.0 is part of the WS-I Basic Profile 1.1.
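The publish-and-discover idea behind a registry can be mimicked with a small in-memory stand-in. The sketch below is not the UDDI API: the registry list, its entries, and the find_business helper are hypothetical, and the only thing borrowed from the example above is the convention that a trailing % in the name means "begins with".

```python
# Hypothetical registry contents; a real UDDI registry holds far richer structures.
registry = [
    {"name": "BooksToGo", "wsdl": "http://example.com/bookstogo.wsdl"},
    {"name": "Books and More", "wsdl": "http://example.com/booksandmore.wsdl"},
    {"name": "TravelNow", "wsdl": "http://example.com/travelnow.wsdl"},
]


def find_business(entries, pattern):
    """Return entries whose name matches; a trailing % means 'begins with'."""
    if pattern.endswith("%"):
        prefix = pattern[:-1]
        return [entry for entry in entries if entry["name"].startswith(prefix)]
    return [entry for entry in entries if entry["name"] == pattern]


matches = find_business(registry, "Books%")
wsdl_pointers = [entry["wsdl"] for entry in matches]
```

Each match carries a pointer to the service's WSDL document, which is the piece a client would fetch next in order to learn how to call the service.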
WS-Security
WS-Security is a standard released by OASIS in March 2004. It describes security-related
enhancements to SOAP messaging that provide for message integrity and confidentiality.
Integrity means that a SOAP message is not tampered with as it travels from a client to its
final destination. Confidentiality means that a SOAP message is only seen by intended
recipients. WS-Security uses security tokens to enable SOAP message security and
integrity. A security token is a collection of claims made by the sender of a SOAP message
(such as the sender's identity). A sender is authenticated by combining a security token with
a digital signature; the signature is used as proof that the sender is indeed associated with
the security token. The standard also provides a general-purpose mechanism for associating
security tokens with messages, and describes how to encode binary security tokens. WS-
Security is quite flexible and can be used with a wide variety of security models and
encryption technologies, such as public-key infrastructure (PKI) and Kerberos, as well as the
Secure Socket Layer (SSL)/Transport Layer Security (TLS) protocol that provides basic end-to-
end security communication on the Internet.
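The way a signature ties a message to a shared secret can be sketched with Python's standard hmac module. This is a deliberately simplified illustration, not the WS-Security or XML Signature procedure: there is no XML canonicalization, the key derivation (a SHA-1 hash of the password) is an assumption made for the example, and the function names are ours.

```python
import base64
import hashlib
import hmac


def sign(body: bytes, password: str) -> str:
    """Compute an HMAC-SHA1 over the message body with a password-derived key."""
    key = hashlib.sha1(password.encode("utf-8")).digest()  # simplified key derivation
    mac = hmac.new(key, body, hashlib.sha1).digest()
    return base64.b64encode(mac).decode("ascii")


def verify(body: bytes, password: str, signature: str) -> bool:
    """The receiver, who knows the shared secret, recomputes and compares."""
    return hmac.compare_digest(sign(body, password), signature)


body = b"<StockSymbol>QQQ</StockSymbol>"
signature = sign(body, "shared-secret")

intact = verify(body, "shared-secret", signature)
tampered = verify(b"<StockSymbol>XYZ</StockSymbol>", "shared-secret", signature)
wrong_sender = verify(body, "another-password", signature)
```

A modified body or a different password yields a different HMAC, so the check covers both integrity and the sender's knowledge of the secret; it does nothing for confidentiality, which needs encryption.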
Example:
(001) <?xml version="1.0" encoding="utf-8"?>
(002) <S:Envelope xmlns:S="http://www.w3.org/2001/12/soap-envelope"
          xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
(003)   <S:Header>
(004)     <m:path xmlns:m="http://schemas.xmlsoap.org/rp/">
(005)       <m:action>http://fabrikam123.com/getQuote</m:action>
(006)       <m:to>http://fabrikam123.com/stocks</m:to>
(007)       <m:id>uuid:84b9f5d0-33fb-4a81-b02b-5b760641c1d6</m:id>
(008)     </m:path>
(009)     <wsse:Security
              xmlns:wsse="http://schemas.xmlsoap.org/ws/2002/04/secext">
(010)       <wsse:UsernameToken Id="MyID">
(011)         <wsse:Username>Zoe</wsse:Username>
(012)       </wsse:UsernameToken>
(013)       <ds:Signature>
(014)         <ds:SignedInfo>
(015)           <ds:CanonicalizationMethod
                    Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
(016)           <ds:SignatureMethod
                    Algorithm="http://www.w3.org/2000/09/xmldsig#hmac-sha1"/>
(017)           <ds:Reference URI="#MsgBody">
(018)             <ds:DigestMethod
                      Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
(019)             <ds:DigestValue>LyLsF0Pi4wPU...</ds:DigestValue>
(020)           </ds:Reference>
(021)         </ds:SignedInfo>
(022)         <ds:SignatureValue>DJbchm5gK...</ds:SignatureValue>
(023)         <ds:KeyInfo>
(024)           <wsse:SecurityTokenReference>
(025)             <wsse:Reference URI="#MyID"/>
(026)           </wsse:SecurityTokenReference>
(027)         </ds:KeyInfo>
(028)       </ds:Signature>
(029)     </wsse:Security>
(030)   </S:Header>
(031)   <S:Body Id="MsgBody">
(032)     <tru:StockSymbol xmlns:tru="http://fabrikam123.com/payloads">QQQ
          </tru:StockSymbol>
(033)   </S:Body>
(034) </S:Envelope>
The lines in the example that illustrate WS-Security enhancements are:
• Line (009). This specifies a <Security> header that contains security information for an intended receiver. This element continues until line (029).
• Lines (010) to (012). These lines specify a security token that is associated with the message. In this case, it defines the username of the client using the <UsernameToken> element. The security token can also be used to specify a password. However, in this example it is assumed that the receiver knows the password; in other words, it is a shared secret.
• Lines (013) to (028) specify a digital signature. This signature ensures the integrity of the signed elements (that they aren't modified). The signature follows the XML Signature specification. In this example, the signature is based on a key generated from the user's password. Typically, stronger signing mechanisms would be used.
• Lines (014) to (021) describe the digital signature. Line (015) specifies how to canonicalize (normalize) the data that is being signed.
• Lines (017) to (020) select the elements that are signed. Specifically, line (017) indicates that the <S:Body> element is signed. In this example, only the message body is signed. Typically, additional elements of the message, such as parts of the routing header, should be included in the signature.
• Line (022) specifies the signature value of the canonicalized form of the data that is being signed, as defined in the XML Signature specification.
• Lines (023) to (027) provide a hint as to where to find the security token associated with this signature. Specifically, lines (024) to (025) indicate that the security token can be found at (pulled from) the specified URL.
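The digest and signature steps described in the list above can be illustrated with a small sketch. It is a simplification under stated assumptions, not the WS-Security procedure itself: XML canonicalization is skipped, the "signed info" is a plain byte string rather than a <ds:SignedInfo> element, and the key is derived from the password with a bare SHA-1 hash purely for illustration. All function names are hypothetical.

```python
import base64
import hashlib
import hmac


def derive_key(password: str) -> bytes:
    # Assumption: a bare SHA-1 of the shared password stands in for the
    # real key derivation, which the notes do not describe.
    return hashlib.sha1(password.encode("utf-8")).digest()


def digest_body(body: bytes) -> str:
    # Counterpart of DigestValue: base64 of the SHA-1 hash of the body.
    return base64.b64encode(hashlib.sha1(body).digest()).decode("ascii")


def sign(signed_info: bytes, key: bytes) -> str:
    # Counterpart of SignatureValue: base64 of HMAC-SHA1 over the signed info.
    mac = hmac.new(key, signed_info, hashlib.sha1).digest()
    return base64.b64encode(mac).decode("ascii")


def verify(body: bytes, digest_value: str, signed_info: bytes,
           signature_value: str, key: bytes) -> bool:
    # The receiver recomputes both values with the shared key and compares.
    body_ok = hmac.compare_digest(digest_body(body), digest_value)
    sig_ok = hmac.compare_digest(sign(signed_info, key), signature_value)
    return body_ok and sig_ok


if __name__ == "__main__":
    key = derive_key("shared-secret")          # hypothetical password
    body = b"<tru:StockSymbol>QQQ</tru:StockSymbol>"
    digest_value = digest_body(body)
    signed_info = ("DigestValue=" + digest_value).encode("ascii")
    signature_value = sign(signed_info, key)
    print(verify(body, digest_value, signed_info, signature_value, key))
    print(verify(b"tampered", digest_value, signed_info, signature_value, key))
```

Because the signature covers the digest rather than the body directly, any change to the body is caught by the digest comparison, and any change to the digest is caught by the signature comparison.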
E-Commerce Infrastructure
Cluster of Servers
A computer cluster is a group of linked computers working together so closely that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks.
Clusters are usually deployed to improve performance and/or availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.
Cluster Categorizations
• High-availability (HA) clusters
  o High-availability clusters (also known as failover clusters) are implemented primarily for the purpose of improving the availability of services that the cluster provides.
  o They operate by having redundant nodes, which are then used to provide service when system components fail.
  o The most common size for an HA cluster is two nodes, which is the minimum requirement to provide redundancy.
  o HA cluster implementations attempt to use redundancy of cluster components to eliminate single points of failure.
  o There are commercial implementations of high-availability clusters for many operating systems.
  o One such implementation is the Gridlock platform from http://www.obsidiandynamics.com.
  o The Linux-HA project is one commonly used free software HA package for the Linux operating system.
  o The LanderCluster from Lander Software can run on Windows, Linux, and UNIX platforms.
• Load-balancing clusters
  o Load balancing is when multiple computers are linked together to share computational workload or function as a single virtual computer.
  o Physically they are multiple machines, but from the user side they function as a single virtual machine.
  o Requests initiated from the user are managed by, and distributed among, all the standalone computers that form the cluster.
  o This results in balanced computational work among different machines, improving the performance of the cluster systems.
• Compute clusters
  o Often clusters are used primarily for computational purposes, rather than handling I/O-oriented operations such as web service or databases.
  o For instance, a cluster might support computational simulations of weather or vehicle crashes.
  o The primary distinction within compute clusters is how tightly coupled the individual nodes are.
  o For instance, a single compute job may require frequent communication among nodes; this implies that the cluster shares a dedicated network, is densely located, and probably has homogeneous nodes.
  o This cluster design is usually referred to as a Beowulf cluster.
  o The other extreme is where a compute job uses one or few nodes, and needs little or no inter-node communication.
  o This latter category is sometimes called "Grid" computing.
  o Tightly coupled compute clusters are designed for work that might traditionally have been called "supercomputing".
  o Middleware such as MPI (Message Passing Interface) or PVM (Parallel Virtual Machine) permits compute clustering programs to be portable to a wide variety of clusters.
• Grid computing
  o Grids are usually computer clusters, but more focused on throughput, like a computing utility, rather than running fewer, tightly coupled jobs.
  o Often, grids will incorporate heterogeneous collections of computers, possibly distributed geographically, sometimes administered by unrelated organizations.
  o Grid computing is optimized for workloads which consist of many independent jobs or packets of work, which do not have to share data between the jobs during the computation process.
  o Grids serve to manage the allocation of jobs to computers which will perform the work independently of the rest of the grid cluster.
  o Resources such as storage may be shared by all the nodes, but intermediate results of one job do not affect other jobs in progress on other nodes of the grid.
  o An example of a very large grid is the Folding@home project.
  o It is analyzing data that is used by researchers to find cures for diseases such as Alzheimer's and cancer.
  o Another large project is the SETI@home project, which may be the largest distributed grid in existence.
  o It uses approximately three million home computers all over the world to analyze data from the Arecibo Observatory radio telescope, searching for evidence of extraterrestrial intelligence.
  o In both of these cases, there is no inter-node communication or shared storage.
  o Individual nodes connect to a main, central location to retrieve a small processing job.
  o They then perform the computation and return the result to the central server.
  o In the case of the @home projects, the software is generally run when the computer is otherwise idle.
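To make the load-balancing category concrete, here is a minimal sketch of one common distribution policy, round robin, in which incoming requests are handed to the nodes in turn. The notes do not name a specific policy, so round robin is an assumption chosen for illustration; the class and node names are hypothetical.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("a cluster needs at least one node")
        self._rotation = cycle(nodes)

    def route(self, request):
        # The request content is ignored; only arrival order matters here.
        return next(self._rotation)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
    served = Counter(balancer.route(f"request-{i}") for i in range(6))
    print(served)  # each node receives the same number of requests
```

Real load balancers usually also track node health and current load; this sketch only shows how requests can be spread evenly so that the user sees one virtual machine.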
A cluster of servers allows servers to work together as a computer cluster, to provide failover and increased availability of applications, or parallel calculating power in the case of high-performance computing (HPC) clusters (as in supercomputing).
A server cluster is a group of independent servers working together as a single system to provide high availability of services for clients. When a failure occurs on one computer in a cluster, resources are redirected and the workload is redistributed to another computer in the cluster.
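The redistribution step can be sketched as follows. This is an illustrative model only: the rule of moving each resource to the least-loaded surviving node is an assumption, not something the notes specify, and the function and node names are hypothetical.

```python
def fail_over(assignments: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Return new assignments with the failed node's resources moved to survivors."""
    survivors = {node: list(res) for node, res in assignments.items() if node != failed}
    if not survivors:
        raise RuntimeError("no surviving node can take over the workload")
    for resource in assignments.get(failed, []):
        # Assumed policy: pick the survivor with the fewest resources,
        # breaking ties by node name so the result is deterministic.
        target = min(survivors, key=lambda node: (len(survivors[node]), node))
        survivors[target].append(resource)
    return survivors


if __name__ == "__main__":
    cluster = {"node1": ["database", "file-share"], "node2": ["print-spooler"]}
    print(fail_over(cluster, "node1"))
```

In a two-node cluster the single survivor simply takes over everything, which is why two nodes is the minimum size that provides redundancy.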
You can use server clusters to ensure that users have constant access to important server-based resources. Server clusters are designed for applications that have long-running in-memory state or frequently updated data. Typical uses for server clusters include file servers, print servers, database servers, and messaging servers.
A cluster consists of two or more computers working together to provide a higher level of availability, reliability, and scalability than can be obtained by using a single computer. Microsoft cluster technologies guard against three specific types of failure:
• Application and service failures, which affect application software and essential services.
• System and hardware failures, which affect hardware components such as CPUs, drives, memory, network adapters, and power supplies.
• Site failures in multisite organizations, which can be caused by natural disasters, power outages, or connectivity outages.
The ability to handle failure allows server clusters to meet requirements for high availability, which is the ability to provide users with access to a service for a high percentage of time while reducing unscheduled outages.
In a server cluster, each server owns and manages its local devices and has a copy of the operating system and the applications or services that the cluster is managing. Devices common to the cluster, such as disks in common disk arrays and the connection media for accessing those disks, are owned and managed by only one server at a time. For most server clusters, the application data is stored on disks in one of the common disk arrays, and this data is accessible only to the server that currently owns the corresponding application or service.
Server clusters are designed so that the servers in the cluster work together to protect data, keep applications and services running after failure on one of the servers, and maintain consistency of the cluster configuration over time.
Types of Server Clusters
Single Quorum Device Cluster
The most widely used cluster type is the single quorum device cluster, also called the standard quorum cluster. In this type of cluster there are multiple nodes with one or more cluster disk arrays, also called the cluster storage, and a connection device, that is, a bus. Each disk in the array is owned and managed by only one server at a time. The disk array also contains the quorum resource. The following figure illustrates a single quorum device cluster with one cluster disk array. Because single quorum device clusters are the most widely used cluster, this Technical Reference focuses on this type of cluster.
Majority Node Set Cluster
Windows Server 2003 supports another type of cluster, the majority node set cluster. In a majority node set cluster, each node maintains its own copy of the cluster configuration data. The quorum resource keeps configuration data consistent across the nodes. For this reason, majority node set clusters can be used for geographically dispersed clusters. Another advantage of majority node set clusters is that a quorum disk can be taken offline for maintenance and the cluster as a whole will continue to operate.
The major difference between majority node set clusters and single quorum device clusters is that single quorum device clusters can operate with just one node, but majority node set clusters need to have a majority of the cluster nodes available for the server cluster to operate. The following figure illustrates a majority node set cluster. For the cluster in the figure to continue to operate, two of the three cluster nodes (a majority) must be available.
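The majority rule can be written as a short calculation. The sketch below assumes "majority" means strictly more than half of the configured nodes, which matches the two-of-three example; the function names are hypothetical.

```python
def majority_needed(total_nodes: int) -> int:
    """Smallest number of nodes that is strictly more than half the cluster."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, available_nodes: int) -> bool:
    """True when enough nodes are available for the cluster to keep operating."""
    return available_nodes >= majority_needed(total_nodes)


if __name__ == "__main__":
    for total in (3, 4, 5):
        print(total, "nodes need", majority_needed(total), "available")
```

Note that adding a fourth node to a three-node cluster raises the required majority from two to three, so the cluster still tolerates only one failure; this is one reason odd node counts are common.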
Local Quorum Cluster
A local quorum cluster, also called a single node cluster, has a single node and is often used for testing. The following figure illustrates a local quorum cluster.
Virtualization
Server virtualization is the masking of server resources, including the number and identity of individual physical servers, processors, and operating systems, from server users. The server administrator uses a software application to divide one physical server into multiple isolated virtual environments. The virtual environments are sometimes called virtual private servers, but they are also known as guests, instances, containers or emulations.
Virtualization was invented more than thirty years ago to allow large expensive mainframes to be easily shared among different application environments. As hardware prices went down, the need for virtualization faded away. More recently, virtualization at all levels (system, storage, and network) became important again as a way to improve system security, reliability and availability, reduce costs, and provide greater flexibility.
Approaches to Virtualization
There are three popular approaches to server virtualization:
1. The virtual machine model,
2. The paravirtual machine model,
3. Virtualization at the operating system (OS) layer.
Virtual Machines
Virtual machines are based on the host/guest paradigm. Each guest runs on a virtual imitation of the hardware layer. This approach allows the guest operating system to run without modifications. It also allows the administrator to create guests that use different operating systems. The guest has no knowledge of the host's operating system because it is not aware that it's not running on real hardware.
It does, however, require real computing resources from the host, so it uses a hypervisor to coordinate instructions to the CPU. The hypervisor is called a virtual machine monitor (VMM). It validates all the guest-issued CPU instructions and manages any executed code that requires additional privileges. VMware and Microsoft Virtual Server both use the virtual machine model.
Paravirtual Machine
The paravirtual machine (PVM) model is also based on the host/guest paradigm and it uses a virtual machine monitor too. In the paravirtual machine model, however, the VMM actually modifies the guest operating system's code. This modification is called porting. Porting supports the VMM so it can utilize privileged system calls sparingly. Like virtual machines, paravirtual machines are capable of running multiple operating systems.
Virtualization at the OS Level
Virtualization at the OS level works a little differently. It isn't based on the host/guest paradigm. In the OS level model, the host runs a single OS kernel as its core and exports operating system functionality to each of the guests. Guests must use the same operating system as the host, although different distributions of the same system are allowed.
This distributed architecture eliminates system calls between layers, which reduces CPU usage overhead. It also requires that each partition remain strictly isolated from its neighbors so that a failure or security breach in one partition isn't able to affect any of the other partitions.
In this model, common binaries and libraries on the same physical machine can be shared, allowing an OS level virtual server to host thousands of guests at the same time.
Virtualization Techniques
Guest Operating System Virtualization
In this scenario the physical host computer system runs a standard unmodified operating system such as Windows, Linux, Unix or MacOS X. Running on this operating system is a virtualization application which executes in much the same way as any other application, such as a word processor or spreadsheet, would run on the system. It is within this virtualization application that one or more virtual machines are created to run the guest operating systems on the host computer.
The virtualization application is responsible for starting, stopping and managing each virtual machine and essentially controlling access to physical hardware resources on behalf of the individual virtual machines. The virtualization application also engages in a process known as binary rewriting, which involves scanning the instruction stream of the executing guest system and replacing any privileged instructions with safe emulations. This has the effect of making the guest system think it is running directly on the system hardware, rather than in a virtual machine within an application.
As outlined in the above diagram, the guest operating systems operate in virtual machines within the virtualization application which, in turn, runs on top of the host operating system in the same way as any other application. Clearly, the multiple layers of abstraction between the guest operating systems and the underlying host hardware are not conducive to high levels of virtual machine performance.
This technique does, however, have the advantage that no changes are necessary to either host or guest operating systems and no special CPU hardware virtualization support is required.
Shared Kernel Virtualization
Shared kernel virtualization (also known as system level or operating system virtualization) takes advantage of the architectural design of Linux and UNIX based operating systems. In order to understand how shared kernel virtualization works it helps to first understand the two main components of Linux or UNIX operating systems. At the core of the operating system is the kernel. The kernel, in simple terms, handles all the interactions between the operating system and the physical hardware. The second key component is the root file system, which contains all the libraries, files and utilities necessary for the operating system to function. Under shared kernel virtualization the virtual guest systems each have their own root file system but share the kernel of the host operating system. This structure is illustrated in the following architectural diagram.
This type of virtualization is made possible by the ability of the kernel to dynamically change the current root file system (a concept known as chroot) to a different root file system without having to reboot the entire system. Essentially, shared kernel virtualization is an extension of this capability.
Perhaps the biggest single drawback of this form of virtualization is the fact that the guest operating systems must be compatible with the version of the kernel which is being shared. It is not, for example, possible to run Microsoft Windows as a guest on a Linux system using the shared kernel approach. Nor is it possible for a Linux guest system designed for the 2.6 version of the kernel to share a 2.4 version kernel.
Kernel Level Virtualization
Under kernel level virtualization the host operating system runs on a specially modified kernel which contains extensions designed to manage and control multiple virtual machines, each containing a guest operating system. Unlike shared kernel virtualization, each guest runs its own kernel, although similar restrictions apply in that the guest operating systems must have been compiled for the same hardware as the kernel in which they are running. Examples of kernel level virtualization technologies include User Mode Linux (UML) and Kernel-based Virtual Machine (KVM). The following diagram provides an overview of the kernel level virtualization architecture:
Cloud Computing
Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid.
Cloud computing is a paradigm shift following the shift from mainframe to client-server in the early 1980s. Details are abstracted from the users, who no longer have need for expertise in, or control over, the technology infrastructure "in the cloud" that supports them. Cloud computing describes a new supplement, consumption, and delivery model for IT services based on the Internet, and it typically involves over-the-Internet provision of dynamically scalable and often virtualized resources. It is a byproduct and consequence of the ease of access to remote computing sites provided by the Internet. This frequently takes the form of web-based tools or applications that users can access and use through a web browser as if they were programs installed locally on their own computer.
NIST provides a somewhat more objective and specific definition. The term "cloud" is used as a metaphor for the Internet, based on the cloud drawing used in the past to represent the telephone network and later to depict the Internet in computer network diagrams as an abstraction of the underlying infrastructure it represents.
Typical cloud computing providers deliver common business applications online that are accessed from another Web service or software like a Web browser, while the software and data are stored on servers. A key element of cloud computing is customization and the creation of a user-defined experience.
Most cloud computing infrastructures consist of services delivered through common centers and built on servers. Clouds often appear as single points of access for consumers' computing needs. Commercial offerings are generally expected to meet quality of service (QoS) requirements of customers, and typically include service level agreements (SLAs). The major cloud service providers include Salesforce, Amazon and Google. Some of the larger IT firms that are actively involved in cloud computing are Microsoft, Hewlett Packard and IBM.
Cloud computing derives characteristics from, but should not be confused with:
1. Autonomic computing: "computer systems capable of self-management".
2. Client-server model: client-server computing refers broadly to any distributed application that distinguishes between service providers (servers) and service requesters (clients).
3. Grid computing: "a form of distributed computing and parallel computing, whereby a 'super and virtual computer' is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks".
4. Mainframe computer: powerful computers used mainly by large organizations for critical applications, typically bulk data processing such as census, industry and consumer statistics, enterprise resource planning, and financial transaction processing.
5. Utility computing: the "packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility, such as electricity".
6. Peer-to-peer: a distributed architecture without the need for central coordination, with participants being at the same time both suppliers and consumers of resources (in contrast to the traditional client-server model).
Characteristics
In general, cloud computing customers do not own the physical infrastructure, instead avoiding capital expenditure by renting usage from a third-party provider. They consume resources as a service and pay only for resources that they use. Many cloud-computing offerings employ the utility computing model, which is analogous to how traditional utility services (such as electricity) are consumed, whereas others bill on a subscription basis. Sharing "perishable and intangible" computing power among multiple tenants can improve utilization rates, as servers are not unnecessarily left idle (which can reduce costs significantly while increasing the speed of application development).
A side effect of this approach is that overall computer usage rises dramatically, as customers do not have to engineer for peak load limits. In addition, "increased high-speed bandwidth" makes it possible to receive the same. The cloud is becoming increasingly associated with small and medium enterprises (SMEs), as in many cases they cannot justify or afford the large capital expenditure of traditional IT.
SMEs also typically have less existing infrastructure, less bureaucracy, more flexibility, and smaller capital budgets for purchasing in-house technology. Similarly, SMEs in emerging markets are typically unburdened by established legacy infrastructures, thus reducing the complexity of deploying cloud solutions.
Architecture
Cloud architecture, the systems architecture of the software systems involved in the delivery of cloud computing, typically involves multiple cloud components communicating with each other over application programming interfaces, usually web services. This resembles the Unix philosophy of having multiple programs each doing one thing well and working together over universal interfaces. Complexity is controlled and the resulting systems are more manageable than their monolithic counterparts.
The two most significant components of cloud computing architecture are known as the front end and the back end. The front end is the part seen by the client, i.e. the computer user. This includes the client's network (or computer) and the applications used to access the cloud via a user interface such as a web browser. The back end of the cloud computing architecture is the 'cloud' itself, comprising various computers, servers and data storage devices.
Key Features
• Agility improves with users' ability to rapidly and inexpensively re-provision technological infrastructure resources.
• Cost is claimed to be greatly reduced and capital expenditure is converted to operational expenditure. This ostensibly lowers barriers to entry, as infrastructure is typically provided by a third party and does not need to be purchased for one-time or infrequent intensive computing tasks. Pricing on a utility computing basis is fine-grained with usage-based options, and fewer IT skills are required for implementation (in-house).
• Device and location independence enable users to access systems using a web browser regardless of their location or what device they are using (e.g., PC, mobile). As infrastructure is off-site (typically provided by a third party) and accessed via the Internet, users can connect from anywhere.
• Multi-tenancy enables sharing of resources and costs across a large pool of users, thus allowing for:
  o Centralization of infrastructure in locations with lower costs (such as real estate, electricity, etc.)
  o Peak-load capacity increases (users need not engineer for highest possible load levels)
  o Utilization and efficiency improvements for systems that are often only 10-20% utilized.
• Reliability is improved if multiple redundant sites are used, which makes well designed cloud computing suitable for business continuity and disaster recovery. Nonetheless, many major cloud computing services have suffered outages, and IT and business managers can at times do little when they are affected.
• Scalability via dynamic ("on-demand") provisioning of resources on a fine-grained, self-service basis near real-time, without users having to engineer for peak loads. Performance is monitored, and consistent and loosely coupled architectures are constructed using web services as the system interface. One of the most important new methods for overcoming performance bottlenecks for a large class of applications is data parallel programming on a distributed data grid.
• Security could improve due to centralization of data, increased security-focused resources, etc., but concerns can persist about loss of control over certain sensitive data, and the lack of security for stored kernels. Security is often as good as or better than under traditional systems, in part because providers are able to devote resources to solving security issues that many customers cannot afford. Providers typically log accesses, but accessing the audit logs themselves can be difficult or impossible. Furthermore, the complexity of security is greatly increased when data is distributed over a wider area and/or number of devices.
• Maintenance of cloud computing applications is easier, since they don't have to be installed on each user's computer. They are easier to support and to improve since the changes reach the clients instantly.
• Metering means that cloud computing resource usage should be measurable and should be metered per client and application on a daily, weekly, monthly, and yearly basis.
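As an illustration of metering combined with usage-based pricing, the sketch below totals usage per client, application and month and then applies a flat rate. The record layout, the CPU-hour unit and the rate are all assumptions made for the example; real providers meter many more dimensions.

```python
from collections import defaultdict
from datetime import date


def meter_monthly(records):
    """Sum CPU hours per (client, application, year, month).

    `records` is an iterable of (client, application, day, cpu_hours)
    tuples; this layout is a hypothetical one chosen for the sketch.
    """
    totals = defaultdict(float)
    for client, application, day, cpu_hours in records:
        totals[(client, application, day.year, day.month)] += cpu_hours
    return dict(totals)


def bill(totals, rate_per_cpu_hour):
    """Apply a flat, assumed rate to each metered total."""
    return {key: hours * rate_per_cpu_hour for key, hours in totals.items()}


if __name__ == "__main__":
    usage = [
        ("acme", "webshop", date(2010, 1, 4), 1.5),
        ("acme", "webshop", date(2010, 1, 5), 2.5),
        ("acme", "reports", date(2010, 1, 5), 8.0),
        ("acme", "webshop", date(2010, 2, 1), 4.0),
    ]
    for key, cost in bill(meter_monthly(usage), rate_per_cpu_hour=0.5).items():
        print(key, cost)
```

Daily, weekly or yearly views follow the same pattern with a different grouping key, which is why fine-grained raw records are worth keeping.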
Cloud computing provides the means through which everything, from computing power to computing infrastructure, applications, and business processes to personal collaboration, can be delivered to you as a service wherever and whenever you need it.
Deployment Models
Cloud computing is offered in different forms:
• Public clouds
  o Public cloud or external cloud describes cloud computing in the traditional mainstream sense, whereby resources are dynamically provisioned on a fine-grained, self-service basis over the Internet, via web applications/web services, from an off-site third-party provider who bills on a fine-grained utility computing basis.
• Community clouds
  o A community cloud may be established where several organizations have similar requirements and seek to share infrastructure so as to realize some of the benefits of cloud computing. With the costs spread over fewer users than a public cloud (but more than a single tenant) this option is more expensive but may offer a higher level of privacy, security and/or policy compliance.
• Private clouds
• Hybrid clouds
  o A hybrid cloud environment consisting of multiple internal and/or external providers "will be typical for most enterprises". By integrating multiple cloud services users may be able to ease the transition to public cloud services.
Cloud Computing Layers
Infrastructure As A Service (IAAS)
IaaS is the delivery of computer hardware (servers, networking technology, storage, and data centre space) as a service. You also can expect it to include the delivery of operating systems and virtualization technology to manage the resources. The IaaS customers rent computing resources instead of buying and installing them in their own data centre. Consumers control and manage the systems in terms of the operating systems, applications, storage, and network connectivity, but do not themselves control the cloud infrastructure.
• Much of the IaaS market will likely follow in the path of the ISP market. The ISP model has been proven and some large ISPs (GoDaddy at www.godaddy.com and inMotion hosting at www.inmotionhosting.com) run millions of Web sites.
• The ISP service is typically paid for based on the amount of resources used over time. This can include dynamic scaling so that if more resources are required than expected, they will be provided immediately (up to a given limit).
• The arrangement involves an agreed-upon service level, normally 99.9 percent availability or better, with limits set on CPU usage, memory, disk space, and Internet bandwidth. No one will object if you want to rent a server or a virtual server from an ISP and you run a data mart (instead of running a Web site).
• Nothing in the customer agreement stops you from using the resources in that way. It wouldn't make much sense, however, because you probably wouldn't get the service level agreement you wanted or the support you needed. IaaS takes the ISP model to a new level.
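A service level such as 99.9 percent availability is easier to judge when converted into permitted downtime. The sketch below does that arithmetic; the function name is hypothetical, and the period lengths (a 365-day year, a 30-day month) are assumptions that an actual agreement may define differently.

```python
def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = 365 * 24) -> float:
    """Hours of downtime permitted in the period at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return period_hours * (1 - availability_percent / 100)


if __name__ == "__main__":
    yearly = allowed_downtime_hours(99.9)
    monthly_minutes = allowed_downtime_hours(99.9, period_hours=30 * 24) * 60
    print(f"per year:  about {yearly:.2f} hours")
    print(f"per month: about {monthly_minutes:.1f} minutes")
```

At 99.9 percent this works out to roughly 8.76 hours per year, or about 43 minutes in a 30-day month, which is the budget an application hosted on such a service has to tolerate.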
Platform As A Service (PAAS)
PaaS has many definitions; we'd like you to think about it as a computing platform that includes a set of development, middleware, and deployment capabilities. A key vendor characteristic is creating and encouraging a deep ecosystem of partners who all commit to this environment for the future.
When organizations are looking for capacity on demand, they often look to Infrastructure as a Service (IaaS). However, when an organization is looking for a deeper set of capabilities, they look at Platform as a Service (PaaS).
• PaaS has to leverage the Internet.
• PaaS must offer some type of development language so professional developers (and in some cases users) can add value.
• These environments need a way to monitor and measure resource use and to track overall performance of the vendor's platform.
• Almost all PaaS platforms are based on a multi-tenancy architecture (which lets multiple clients run their copy separately from each other through virtualization) so that each customer's code or data is isolated from others.
• A PaaS environment needs to support the development lifecycle and the team development process, including testing.
• A PaaS platform needs to include services interfaces such as SOAP (Simple Object Access Protocol) and XML (eXtensible Markup Language), among others.
• A PaaS platform must be able to deploy, manage, test, and maintain the developed applications.
• A PaaS platform must support well-defined and well-documented interfaces so elements and components can be used in the following:
  o Composite applications, which are created by combining services to create an enterprise application based on orchestration of business logic and rules.
  o Portals, which are an organized environment that organizes application components for the customer.
  o Mashups, which let end users easily bring together two or more business services that can communicate and exchange data.
Software As A Service (SAAS)
Mainframe systems were simply too expensive for most companies to buy their own.
A couple of decades later, minicomputers, servers, and personal computers changed the
dynamics of the market. Economically, it was feasible for any Tom, Dick, and Harriet to own
their own systems and the software. Not all software moved to an internal model, however.
(Software such as ADP's payroll system, for example, remained Software as a Service.)
Two key events converged to create the model that we now call Software as a Service (SaaS):
• First, the Internet became a commercial platform.
• Second, software costs and complexity grew so much that running, upgrading,
and managing software became too complex for many companies.
   o This was especially true for small and medium-sized companies that didn't
     want the expenses of managing all the components. These companies were the
     first to embrace this new generation of SaaS.
Characterizing Software as a Service:
What characteristics have to be in place for a SaaS to be commercially viable? Here's what we
think is necessary:
• The SaaS application needs to be generalized enough so that lots of
customers will be interested in the service. Here are some examples of these
types of applications: accounting, collaboration, project management, testing,
analytics, content management, Internet marketing, risk management and, of course,
CRM. What doesn't work as SaaS? A specialized one-of-a-kind application with a small
number of potential customers.
• SaaS applications need sophisticated navigation and ease of use. If a SaaS
application isn't easy to use, customers will simply stop subscribing. Most SaaS
vendors offer prospective customers a free trial for a month or so. If the customer
doesn't start using the application during that first month, it's likely that the customer
won't sign a contract. This is really important because it has been reported that
less than 20 percent of users remain customers after the first month or so.
• The SaaS application needs to be modular and service oriented. Without this
modular approach, it will be hard to change and difficult to have third-party independent
companies join the ecosystem.
• A SaaS application needs to include measuring and monitoring so customers
can be charged for actual usage.
• A SaaS application must have a built-in billing service.
• SaaS applications need published interfaces and an ecosystem of partners
who can expand the company's customer base and market reach.
• SaaS applications have to ensure that each customer's data and specialized
configurations are separate and secure from other customers' data and
configurations.
• SaaS applications need to provide sophisticated business process
configurators for customers. Each customer can change the process within the
standardized SaaS application. For example, a company might want to add a process
so a manager has to approve the price being offered to a new customer. A built-in
configuration tool enables this to be done on an ad hoc basis without programming.
• SaaS applications need to constantly provide fast releases of new features
and new capabilities. This must be done without impacting the customer's ability to
continue business as usual.
• SaaS applications have to protect the integrity of customer data. That
includes providing techniques for allowing data to migrate either to a private
database inside the firewall or to a third-party storage capability.
Server Consolidation using Cloud
Server consolidation is an approach to the efficient usage of computer server resources in
order to reduce the total number of servers or server locations that an organization
requires. The practice developed in response to the problem of server sprawl, a situation
in which multiple, underutilized servers take up more space and consume more resources
than can be justified by their workload.
• Servers in many companies typically run at 15-20% of their capacity,
   o which may not be a sustainable ratio in the current economic environment.
• Businesses are increasingly turning to server consolidation as one means of cutting
unnecessary costs and maximizing return on investment (ROI) in the data center.
   o Of 518 respondents in a Gartner Group research study, six percent had
     conducted a server consolidation project,
   o 61% were currently conducting one, and
   o 28% were planning to do so in the immediate future.
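The utilization figure above suggests a simple back-of-the-envelope estimate, sketched here in Python. It assumes identical servers whose workloads can be freely combined; the function name, the 100-server estate, and the 60% target utilization are hypothetical values chosen for illustration, not figures from the study.

```python
def servers_after_consolidation(current_servers, current_pct, target_pct):
    """Estimate how many identical servers remain once the same total load
    is packed onto machines running at target_pct utilization.

    Utilizations are whole-number percentages (e.g. 15 for 15%).
    """
    if not (0 < current_pct <= 100 and 0 < target_pct <= 100):
        raise ValueError("utilization must be a percentage in 1..100")
    total_load = current_servers * current_pct   # load in server-percent units
    return -(-total_load // target_pct)          # ceiling division on integers

# Hypothetical estate: 100 servers, consolidated to a 60% utilization target.
print(servers_after_consolidation(100, 15, 60))  # servers running at 15% today
print(servers_after_consolidation(100, 20, 60))  # servers running at 20% today
```

Real projects would also have to leave headroom for peaks and redundancy, so the result is better read as a lower bound than as a plan.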
Although consolidation can substantially increase the efficient use of server resources, it may
also result in complex configurations of data, applications, and servers that can be
confusing for the average user to contend with. To alleviate this problem, server
virtualization may be used to mask the details of server resources from users while
optimizing resource sharing. Another approach to server consolidation is the use of blade
servers to maximize the efficient use of space.
There are four things to consider when looking at server consolidation: hardware,
redundancy, operating system, and maximizing efficiency.
Cloud computing can be used for server consolidation by moving applications such as
Exchange mail, data storage, etc. to a public cloud.
Note: Relate all the cloud layers and deployment models to server consolidation and
explain the applications that can be housed remotely on a cloud to minimize the
number, redundancy and underutilization of servers.
For many companies, the concept of server consolidation connotes sweeping projects that
replace numerous smaller servers with powerful, expensive workhorses. Many
organizations have pursued this approach and realized significant cost savings and
improved efficiency. Achieving the benefits of server consolidation, however, does not
necessarily require a large-scale effort that is both extensive and expensive in terms of a
lengthy planning cycle and upfront investment. Many organizations consolidate as part of
the natural refresh cycle of their technology infrastructures. These efforts are
iterative processes, rather than one-time projects, that focus on reducing the number of
datacenters as the technology infrastructure is updated and augmented. The results are still
significant. By consolidating, organizations are able to boost efficiency and improve their
service capabilities while maintaining an infrastructure that is robust, highly available,
and adaptive to changing business requirements. Dell is introducing an expanded program
aimed at consolidating Microsoft-based environments to help customers reduce their total
costs and gain greater control over their IT infrastructure. The company hopes to use its
new Server Consolidation ROI Analyst Tool, latest-generation enterprise servers, systems
management software, and services to help organizations realize the benefits of consolidation
as they renew their technology infrastructures.
Drivers & Benefits of Server Consolidation
A good deal of the impetus for consolidation stems from the proliferation of distributed
systems that occurred during the 1980s and 1990s. As the price/performance and
functionality of small and midrange systems improved dramatically, many organizations
moved toward a model of deploying distributed servers to support departmental processes.
While such systems allowed organizations relatively cheap and easy access to powerful
departmental applications, they eventually led to a quagmire of systems that were
incompatible, unable to share information, and difficult or impossible to manage consistently.
Organizations also faced the challenge of meeting increased levels of availability and security.
At the same time, many companies looked to standardize their environments on a
consolidated set of products from a more limited number of vendors in order to realize not
only cost savings but also procurement, service, and management efficiencies.
Meanwhile, many IT organizations became more service oriented. Their focus
shifted from managing processes and technology to providing information access to a
broad set of user constituencies. Users began to expect universal access to applications and
information.
Companies found that the duplication of servers, applications, and databases not only was
expensive to maintain but also kept them from utilizing their resources and information to the
maximum extent.
The response has been a trend toward recentralization and consolidation of IT resources
in the datacenter. Consolidation allows companies to improve overall business processing
through three primary IT objectives: a higher and more consistent level of service, greater
efficiency and control over operations, and the flexibility to respond to constantly changing
business requirements (see Figure 1).
ACHIEVING HIGHER LEVELS OF SERVICE
Providing high levels of service is on every IT manager's mind as the user base expands beyond
internal users to customers, suppliers, students, patients, other government agencies, and
business partners.
Everyone expects high levels of application and information availability and consistent
response times. Consolidation provides a consistent management framework, which can lead
to a more predictable and consistent level of service.
EFFICIENCY AND CONTROL
The key to gaining efficiency is in better utilizing the available IT resources. Consolidating the
server environment can lead to a more disciplined approach to management. As much as 55%
of IT costs are associated with personnel, and, in a consolidated environment, the productivity
of administrative personnel increases greatly.
Some organizations may realize the benefit in reduced operational costs. IDC studies have
shown a 7:1 cost savings in people management resources when processes and resources
are consolidated. Others will benefit from the ability to focus highly skilled resources on
higher-value tasks. Tighter management control will also enhance availability, security, and
the ability to audit the consumption of services and charge back appropriately.
For IT managers who consistently cite floor space and power consumption as key
issues, the benefits of consolidation are obvious. Many organizations are looking to reduce
not only the number of servers but the number of datacenters. Because the cost of network
bandwidth has dropped dramatically, companies no longer need to deploy servers as close
as possible to the users they support. Therefore, many organizations can implement smaller,
regional datacenters, instead of a datacenter at each location.
As server architectures grow increasingly modular, with the deployment of ultradense
solutions such as server blades, the costs associated with power, space, cooling, and cabling
will continue to drop.
GREATER ORGANIZATIONAL FLEXIBILITY
The greatest strategic value that companies gain from consolidation is an improved ability
to efficiently adapt the infrastructure to incorporate new technologies and respond to
new business requirements. All types of organizations (business, government, medical
providers, and educational institutions) are driven by the need for high levels of services
and information availability. With a consistent framework for managing data, IT spends less
time moving data and transforming it into a usable form and, therefore, can respond more
quickly to demands for data availability. The infrastructure is then better able to adapt as
the organization moves forward. Once the IT center has completed the consolidation process
and achieved the desired environment, users can expect more rapid deployment of new
applications and features, leading to greater flexibility to respond to changing demands.
TYPES OF CONSOLIDATION
Consolidation of the server environment can occur at many different levels. Most
organizations pursue different levels of consolidation at different times, depending on their
particular requirements. Consolidation also tends to occur over time, as an iterative process,
moving through the different types of consolidation.
The various types of consolidation should be viewed as steps along a continuum. The
degrees of consolidation can be described in four general categories (see Figure 2):
• Logical consolidation reduces the number of points of control in the environment to a
single administrative stream and reduces the number of consoles. The servers remain
dispersed, while local operations are reduced or eliminated and management functions
such as backup, restore, recovery, maintenance, and user support are performed
remotely. A major benefit of logical consolidation is a reduction in operational headcount,
or more efficient use of the skills already on hand. Using fewer people and a consistent set
of products and processes can help reduce both the dilution of skills across the organization
and the opportunity for errors. This approach reduces the cost of maintaining the environment
and improves service to users.
• Physical consolidation involves consolidating components of the IT environment in one
physical location. This colocation leads to greater efficiency by eliminating the replication
of skill sets across different locations. When systems are in a central location, networking
becomes much easier and more efficient, power costs are reduced, backup can be
performed more efficiently, and security can be increased. There are also subsequent savings
in floor space costs. Consolidating locations also provides opportunity for configuring systems
for higher availability.
• Workload consolidation reduces the number and variety of components in the environment.
It involves not only servers but also other physical elements such as tapes, disks, network
devices and connections, software, operating systems, and peripherals. The number of
processes and procedures is also reduced. With fewer hardware and software standards
to manage, IT departments can more easily move and change systems, applications,
and peripherals. Network performance is enhanced even further, as are security and
availability. Common operating system environments allow more applications to share the
same server, taking advantage of faster processors. Workload consolidation can be organized
by application type, by operating system type, or by line of business.
• Transparent consolidation (including storage) involves pulling together a number of IT
centers across a campus or network and implementing storage area networks to create a
single set of resources. These environments will be highly automated, and their growth will
be driven by the availability of high-speed, low-cost data networks.
Google App Engine
Google App Engine is a web application hosting service. By "web application," we mean an
application or service accessed over the Web, usually with a web browser: storefronts
with shopping carts, social networking sites, multiplayer games, mobile applications, survey
applications, project management, collaboration, publishing, and all of the other things we're
discovering are good uses for the Web. App Engine can serve traditional website content too,
such as documents and images, but the environment is especially designed for real-time
dynamic applications.
In particular, Google App Engine is designed to host applications with many simultaneous
users. When an application can serve many simultaneous users without degrading
performance, we say it scales. Applications written for App Engine scale automatically. As
more people use the application, App Engine allocates more resources for the application
and manages the use of those resources. The application itself does not need to know
anything about the resources it is using.
Unlike traditional web hosting or self-managed servers, with Google App Engine, you only pay
for the resources you use. These resources are measured down to the gigabyte, with no
monthly fees or upfront charges. Billed resources include CPU usage, storage per month,
incoming and outgoing bandwidth, and several resources specific to App Engine services.
To help you get started, every developer gets a certain amount of resources for free, enough
for small applications with low traffic. Google estimates that with the free resources, an app
can accommodate about 5 million page views a month.
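The pay-for-what-you-use model with a free allowance can be illustrated with a short Python sketch. Everything numeric here is hypothetical: the resource names, free quotas, and per-unit rates are made up for the example and are not Google's actual quotas or prices. Amounts are kept in whole cents to avoid rounding surprises.

```python
def monthly_bill_cents(usage, free_quota, rate_cents):
    """Charge only for usage above the free quota, resource by resource.

    usage, free_quota, rate_cents: dicts keyed by resource name.
    Returns the bill in whole cents.
    """
    total = 0
    for resource, used in usage.items():
        billable = max(0, used - free_quota.get(resource, 0))
        total += billable * rate_cents[resource]
    return total

# Hypothetical quotas and rates (not real App Engine pricing).
free_quota = {"cpu_hours": 6, "storage_gb": 1, "bandwidth_gb": 1}
rate_cents = {"cpu_hours": 10, "storage_gb": 15, "bandwidth_gb": 12}

small_app = {"cpu_hours": 2, "storage_gb": 1, "bandwidth_gb": 0}
busy_app = {"cpu_hours": 10, "storage_gb": 3, "bandwidth_gb": 1}

print(monthly_bill_cents(small_app, free_quota, rate_cents))  # stays within the free tier
print(monthly_bill_cents(busy_app, free_quota, rate_cents))   # pays only for the excess
```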
App Engine can be described as three parts: the runtime environment, the datastore, and
the scalable services. In this chapter, we'll look at each of these parts at a high level.
The Runtime Environment
An App Engine application responds to web requests. A web request begins when a client,
typically a user's web browser, contacts the application with an HTTP request, such as to
fetch a web page at a URL. When App Engine receives the request, it identifies the
application from the domain name of the address, either an .appspot.com subdomain
(provided for free with every app) or a subdomain of a custom domain name you have
registered and set up with Google Apps. App Engine selects a server from many possible
servers to handle the request, making its selection based on which server is most likely to
provide a fast response. It then calls the application with the content of the HTTP request,
receives the response data from the application, and returns the response to the client.
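The request/response cycle described above can be sketched with Python's standard WSGI calling convention (PEP 3333). This is an assumption made for illustration: it shows the general shape of a handler that is called with a request and returns a response, not App Engine's own dispatch code, and the `application` function and its greeting are hypothetical.

```python
def application(environ, start_response):
    """A stateless request handler: everything it needs arrives in `environ`,
    and nothing is kept in memory for the next request."""
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from " + path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)  # status line and headers go back first
    return [body]                      # then the response body as bytes
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, application).serve_forever()` will serve the handler on port 8000.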
From the application's perspective, the runtime environment springs into existence when
the request handler begins, and disappears when it ends. App Engine provides at least
two methods for storing data that persists between requests (discussed later), but these
mechanisms live outside of the runtime environment. By not retaining state in the runtime
environment between requests (or at least, by not expecting that state will be retained
between requests), App Engine can distribute traffic among as many servers as it needs to
give every request the same treatment, regardless of how much traffic it is handling at one
time.
Application code cannot access the server on which it is running in the traditional sense. An
application can read its own files from the filesystem, but it cannot write to files, and it
cannot read files that belong to other applications. An application can see environment
variables set by App Engine, but manipulations of these variables do not necessarily
persist between requests. An application cannot access the networking facilities of the server
hardware, though it can perform networking operations using services.
In short, each request lives in its own "sandbox." This allows App Engine to handle a
request with the server that would, in its estimation, provide the fastest response. There is
no way to guarantee that the same server hardware will handle two requests, even if the
requests come from the same client and arrive relatively quickly. Sandboxing
also allows App Engine to run multiple applications on the same server without the
behaviour of one application affecting another.
In addition to limiting access to the operating system, the runtime environment also limits the
amount of clock time, CPU use, and memory a single request can take. App Engine keeps
these limits flexible, and applies stricter limits to applications that use up more resources to
protect shared resources from "runaway" applications. A request has up to 30 seconds to
return a response to the client. While that may seem like a comfortably large amount for a
web app, App Engine is optimized for applications that respond in less than a second.
Also, if an application uses many CPU cycles, App Engine may slow it down so the app isn't
hogging the processor on a machine serving multiple apps. A CPU-intensive request handler
may take more clock time to complete.
The Datastore
Most useful web applications need to store information during the handling of a request for
retrieval during a later request. A typical arrangement for a small website involves a single
database server for the entire site, and one or more web servers that connect to the
database to store or retrieve data. Using a single central database server makes it easy to
have one canonical representation of the data, so multiple users accessing multiple web
servers all see the same and most recent information. But a central server is difficult to
scale once it reaches its capacity for simultaneous connections.
By far the most popular kind of data storage system for web applications in the
past decade has been the relational database, with tables of rows and columns arranged for
space efficiency and concision, and with indexes and raw computing power for performing
queries, especially "join" queries that can treat multiple related records as a queryable unit.
Other kinds of data storage systems include hierarchical datastores (file systems, XML
databases) and object databases. Each kind of database has pros and cons, and which type
is best suited for an application depends on the nature of the application's data and how it
is accessed. And each kind of database has its own techniques for growing past the first
server.
Google App Engine's database system most closely resembles an object database. It is not
a join-query relational database, and if you come from the world of relational-database-backed
web applications (as I did), this will probably require changing the way you think
about your application's data. As with the runtime environment, the design of the App
Engine datastore is an abstraction that allows App Engine to handle the details of distributing
and scaling the application, so your code can focus on other things.
The Services
The datastore's relationship with the runtime environment is that of a service: the application
uses an API to access a separate system that manages all of its own scaling needs
separately from the runtime environment. Google App Engine includes several other
self-scaling services useful for web applications.
The memory cache (or memcache) service is a short-term key-value storage service. Its
main advantage over the datastore is that it is fast, much faster than the datastore for
simple storage and retrieval. The memcache stores values in memory instead of on disk for
faster access. It is distributed like the datastore, so every request sees the same set of keys
and values. However, it is not persistent like the datastore: if a server goes down, such as
during a power failure, memory is erased. It also has a more limited sense of atomicity
and transactionality than the datastore. As the name implies, the memcache service is
best used as a cache for the results of frequently performed queries or calculations.
The application checks for a cached value, and if the value isn't there, it performs the query or
calculation and stores the value in the cache for future use.
App Engine applications can access other web resources using the URL Fetch service.
The service makes HTTP requests to other servers on the Internet, such as to retrieve pages
or interact with web services. Since remote servers can be slow to respond, the URL Fetch
API supports fetching URLs in the background while a request handler does other things, but in
all cases the fetch must start and finish within the request handler's lifetime. The application
can also set a deadline, after which the call is cancelled if the remote host hasn't responded.
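The memcache usage pattern described earlier (check the cache, compute on a miss, store the result for next time) can be sketched in plain Python. The `SimpleCache` class below is a hypothetical in-process stand-in written for this example; it is not the App Engine memcache API, and unlike the real service it is neither distributed nor shared between servers.

```python
import time

class SimpleCache:
    """Hypothetical in-memory key-value cache with optional expiry."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._items[key]       # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._items[key] = (value, expires_at)


def cached(cache, key, compute, ttl_seconds=60):
    """Return the cached value for key, computing and storing it on a miss.

    A result of None is treated as a miss, so None itself is never cached.
    """
    value = cache.get(key)
    if value is None:                  # miss: do the expensive work once
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value
```

Because a cache can lose its contents at any time, the calling code should always be able to fall back to `compute`; the cache only saves repeated work.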
App Engine applications can send messages using the Mail service. Messages can be sent on
behalf of the application or on behalf of the user who made the request that is sending the
email (if the message is from the user). Many web applications use email to notify users,
confirm user actions, and validate contact information.
An application can also receive email messages. If an app is configured to receive email, a
message sent to the app's address is routed to the Mail service, which delivers the message
to the app in the form of an HTTP request to a request handler.
App Engine applications can send and receive instant messages to and from chat services
that support the XMPP protocol, including Google Talk. An app sends an XMPP chat message
by calling the XMPP service. As with incoming email, when someone sends a message to the
app's address, the XMPP service delivers it to the app by calling a request handler.
HADOOP
Data!
We live in the data age. It's not easy to measure the total volume of data stored
electronically, but an IDC estimate put the size of the "digital universe" at 0.18 zettabytes in
2006, and is forecasting a tenfold growth by 2011 to 1.8 zettabytes. A zettabyte is 10^21
bytes, or equivalently one thousand exabytes, one million petabytes, or one billion terabytes.
That's roughly the same order of magnitude as one disk drive for every person in the world.
This flood of data is coming from many sources. Consider the following:
• The New York Stock Exchange generates about one terabyte of new trade data per day.
• Facebook hosts approximately 10 billion photos, taking up one petabyte of storage.
• Ancestry.com, the genealogy site, stores around 2.5 petabytes of data.
• The Internet Archive stores around 2 petabytes of data, and is growing at a rate of 20
terabytes per month.
• The Large Hadron Collider near Geneva, Switzerland, will produce about 15 petabytes of
data per year.
So there's a lot of data out there. But you are probably wondering how it affects you. Most of
the data is locked up in the largest web properties (like search engines), or scientific or
financial institutions, isn't it? Does the advent of "Big Data," as it is being called, affect smaller
organizations or individuals?
Arguably it does. Take photos, for example. My wife's grandfather was an avid photographer,
and took photographs throughout his adult life. His entire corpus of medium format, slide,
and 35mm film, when scanned in at high resolution, occupies around 10 gigabytes.
Compare this to the digital photos that my family took last year, which take up about 5
gigabytes of space. My family is producing photographic data at 35 times the rate my
wife's grandfather's did, and the rate is increasing every year as it becomes easier to take
more and more photos.
More generally, the digital streams that individuals are producing are growing apace.
Microsoft Research's MyLifeBits project gives a glimpse of archiving of personal information
that may become commonplace in the near future. MyLifeBits was an experiment where an
individual's interactions (phone calls, emails, documents) were captured electronically and
stored for later access. The data gathered included a photo taken every minute, which
resulted in an overall data volume of one gigabyte a month. When storage costs come
down enough to make it feasible to store continuous audio and video, the data volume for
a future MyLifeBits service will be many times that.
The trend is for every individual's data footprint to grow, but perhaps more importantly
the amount of data generated by machines will be even greater than that generated by
people. Machine logs, RFID readers, sensor networks, vehicle GPS traces, retail transactions:
all of these contribute to the growing mountain of data.
The volume of data being made publicly available increases every year too. Organizations no
longer have to merely manage their own data: success in the future will be dictated to a
large extent by their ability to extract value from other organizations' data.
Initiatives such as Public Data Sets on Amazon Web Services, Infochimps.org, and
theinfo.org exist to foster the "information commons," where data can be freely (or in the
case of AWS, for a modest price) shared for anyone to download and analyze. Mashups
between different information sources make for unexpected and hitherto unimaginable
applications.
Take, for example, the Astrometry.net project, which watches the Astrometry group on Flickr
for new photos of the night sky. It analyzes each image, and identifies which part of the sky it
is from, and any interesting celestial bodies, such as stars or galaxies. Although it's still a
new and experimental service, it shows the kind of things that are possible when data (in
this case, tagged photographic images) is made available and used for something (image
analysis) that was not anticipated by the creator.
It has been said that "More data usually beats better algorithms," which is to say that for
some problems (such as recommending movies or music based on past preferences), however
fiendish your algorithms are, they can often be beaten simply by having more data (and a less
sophisticated algorithm). The good news is that Big Data is here. The bad news is that we are
struggling to store and analyze it.
Data Storage and Analysis
The problem is simple: while the storage capacities of hard drives have increased massively
over the years, access speeds (the rate at which data can be read from drives) have not
kept up. One typical drive from 1990 could store 1370 MB of data and had a transfer speed of
4.4 MB/s, so you could read all the data from a full drive in around five minutes. Almost 20
years later one terabyte drives are the norm, but the transfer speed is around 100 MB/s, so
it takes more than two and a half hours to read all the data off the disk.
This is a long time to read all data on a single drive, and writing is even slower. The obvious
way to reduce the time is to read from multiple disks at once. Imagine if we had 100
drives, each holding one hundredth of the data. Working in parallel, we could read the data
in under two minutes.
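The read-time arithmetic above can be checked with a few lines of Python. Two assumptions are made for the sketch: the 1990 drive is taken to transfer 4.4 MB/s (the speed consistent with the five-minute figure), and one terabyte is counted as 1,000,000 MB. The function name is illustrative.

```python
def read_time_seconds(capacity_mb, transfer_mb_per_s, drives=1):
    """Time to stream capacity_mb of data split evenly across `drives`
    that are all read in parallel at transfer_mb_per_s each."""
    return capacity_mb / drives / transfer_mb_per_s

old_drive = read_time_seconds(1370, 4.4)                # 1990 drive, read in full
new_drive = read_time_seconds(1_000_000, 100)           # one 1 TB drive
parallel = read_time_seconds(1_000_000, 100, drives=100)  # same data over 100 drives

print(f"1990 drive:  {old_drive / 60:.1f} minutes")
print(f"1 TB drive:  {new_drive / 3600:.1f} hours")
print(f"100 drives:  {parallel / 60:.1f} minutes")
```

Capacity grew by a factor of roughly 700 while transfer speed grew by a factor of about 23, which is why the full-drive read time went from minutes to hours.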
Only using one hundredth of a disk may seem wasteful. But we can store one hundred
datasets, each of which is one terabyte, and provide shared access to them. We can imagine
that the users of such a system would be happy to share access in return for shorter
analysis times, and, statistically, that their analysis jobs would be likely to be spread over
time, so they wouldn't interfere with each other too much.
There's more to being able to read and write data in parallel to or from multiple disks, though.
The first problem to solve is hardware failure: as soon as you start using many pieces of
hardware, the chance that one will fail is fairly high. A common way of avoiding data loss is
through replication: redundant copies of the data are kept by the system so that in the
event of failure, there is another copy available. This is how RAID works, for instance,
although Hadoop's filesystem, the Hadoop Distributed Filesystem (HDFS), takes a
slightly different approach, as you shall see later.
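Why failure becomes likely with many pieces of hardware can be made concrete with a small probability sketch. It assumes drives fail independently, and the 3% per-drive failure probability is a hypothetical figure picked for illustration, not a measured rate.

```python
def chance_of_any_failure(per_drive_failure_probability, drives):
    """Probability that at least one of `drives` independent drives fails:
    one minus the probability that every drive survives."""
    return 1 - (1 - per_drive_failure_probability) ** drives

single = chance_of_any_failure(0.03, 1)    # one drive on its own
many = chance_of_any_failure(0.03, 100)    # a 100-drive system

print(f"1 drive: {single:.1%}   100 drives: {many:.1%}")
```

With these assumed numbers a failure somewhere in the 100-drive system is close to certain (roughly 95%), which is why replication has to be part of the design rather than an afterthought.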
The second problem is that most analysis tasks need to be able to combine the data in
some way; data read from one disk may need to be combined with the data from any of the
other 99 disks. Various distributed systems allow data to be combined from multiple sources,
but doing this correctly is notoriously challenging. MapReduce provides a programming model
that abstracts the problem from disk reads and writes, transforming it into a computation
over sets of keys and values. The important point for the present discussion is that there
are two parts to the computation, the map and the reduce, and it's the interface between the
two where the "mixing" occurs. Like HDFS, MapReduce has reliability built in.
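The map, "mixing", and reduce steps can be sketched in a few lines of single-process Python. This is only a toy model of the programming idea under discussion: it is not Hadoop's API, it does no distribution or fault handling, and all function names are hypothetical.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def shuffle(pairs):
    """The 'mixing' step: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}

def word_mapper(line):
    # Emit (word, 1) for every word on the line.
    for word in line.lower().split():
        yield word, 1

def sum_reducer(key, values):
    return sum(values)

lines = ["big data is here", "more data beats better algorithms"]
counts = reduce_phase(shuffle(map_phase(lines, word_mapper)), sum_reducer)
print(counts["data"])  # 2
```

The mapper and reducer never touch disks or other machines; in a real system the framework decides where each piece runs, which is the abstraction the paragraph above describes.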
This, in a nutshell, is what Hadoop provides: a reliable shared storage and analysis system.
The storage is provided by HDFS, and analysis by MapReduce. There are other parts to
Hadoop, but these capabilities are its kernel.
Comparison with Other Systems
,&e approac& ta1en 'y 5apReduce "ay see" li1e a 'ruteforce approac&. ,&e pre"ise is t&at
t&e entire datasetTor at least a good portion of itTis processed for eac& Auery. But t&is
is its po0er. 5apReduce is a "atch Auery processor( and t&e a'ility to run an ad &oc Auery
against your 0&ole dataset and get t&e results in a reasona'le ti"e is transfor"ati.e. It
c&anges t&e 0ay you t&in1 a'out data( and unloc1s data t&at 0as pre.iously arc&i.ed on tape
or dis1. It gi.es people t&e opportunity to inno.ate 0it& data. Xuestions t&at too1 too long to
get ans0ered 'efore can no0 'e ans0ered( 0&ic& in turn leads to ne0 Auestions and ne0
insig&ts.
For example, Mailtrust, Rackspace's mail division, used Hadoop for processing email logs.
One ad hoc query they wrote was to find the geographic distribution of their users. In their
words:
This data was so useful that we've scheduled the MapReduce job to run monthly and
we will be using this data to help us decide which Rackspace data centers to place new mail
servers in as we grow.
By bringing several hundred gigabytes of data together and having the tools to analyze it,
the Rackspace engineers were able to gain an understanding of the data that they otherwise
would never have had, and, furthermore, they were able to use what they had learned to
improve the service for their customers.
RDBMS
Why can't we use databases with lots of disks to do large-scale batch analysis? Why is
MapReduce needed?
The answer to these questions comes from another trend in disk drives: seek time is
improving more slowly than transfer rate. Seeking is the process of moving the disk's head to
a particular place on the disk to read or write data. It characterizes the latency of a disk
operation, whereas the transfer rate corresponds to a disk's bandwidth.
If the data access pattern is dominated by seeks, it will take longer to read or write large
portions of the dataset than streaming through it, which operates at the transfer rate. On
the other hand, for updating a small proportion of records in a database, a traditional B-Tree
(the data structure used in relational databases, which is limited by the rate it can perform
seeks) works well. For updating the majority of a database, a B-Tree is less efficient than
MapReduce, which uses Sort/Merge to rebuild the database.
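A rough way to see the trade-off between seeking and streaming is to ask how many random seeks fit into the time of one full streaming pass. The sketch below assumes the figures quoted elsewhere in these notes (a one-terabyte dataset, a 10 ms seek, a 100 MB/s transfer rate) and counts one seek per record update, which is a simplification; the function name breakeven_seeks is hypothetical.

```python
def breakeven_seeks(dataset_bytes, transfer_bytes_per_s, seek_ms):
    """Number of random seeks that take as long as one streaming pass."""
    # Time to stream the whole dataset once, in milliseconds.
    stream_ms = dataset_bytes * 1000 // transfer_bytes_per_s
    return stream_ms // seek_ms

TB = 10**12  # decimal terabyte
MB = 10**6   # decimal megabyte

seeks = breakeven_seeks(1 * TB, 100 * MB, 10)
```

Under these assumptions, once the number of seek-bound updates approaches this break-even count, streaming through and rewriting the whole dataset takes no longer than updating records in place.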
In many ways, MapReduce can be seen as a complement to an RDBMS. (The differences
between the two systems are shown in Table 1-1.) MapReduce is a good fit for problems that
need to analyze the whole dataset, in a batch fashion, particularly for ad hoc analysis. An
RDBMS is good for point queries or updates, where the dataset has been indexed to deliver
low-latency retrieval and update times of a relatively small amount of data. MapReduce
suits applications where the data is written once, and read many times, whereas a relational
database is good for datasets that are continually updated.
Table 1-1. RDBMS compared to MapReduce
            Traditional RDBMS            MapReduce
Data size   Gigabytes                    Petabytes
Access      Interactive and batch        Batch
Updates     Read and write many times    Write once, read many times
Structure   Static schema                Dynamic schema
Integrity   High                         Low
Scaling     Nonlinear                    Linear
Another difference between MapReduce and an RDBMS is the amount of structure in the
datasets that they operate on. Structured data is data that is organized into entities that
have a defined format, such as XML documents or database tables that conform to a
particular predefined schema. This is the realm of the RDBMS. Semi-structured data, on the
other hand, is looser, and though there may be a schema, it is often ignored, so it may be
used only as a guide to the structure of the data: for example, a spreadsheet, in which the
structure is the grid of cells, although the cells themselves may hold any form of data.
Unstructured data does not have any particular internal structure: for example, plain text or
image data. MapReduce works well on unstructured or semi-structured data, since it is
designed to interpret the data at processing time. In other words, the input keys and values
for MapReduce are not an intrinsic property of the data, but they are chosen by the person
analyzing the data.
Relational data is often normalized to retain its integrity and remove redundancy.
Normalization poses problems for MapReduce, since it makes reading a record a nonlocal
operation, and one of the central assumptions that MapReduce makes is that it is possible
to perform (high-speed) streaming reads and writes.
A web server log is a good example of a set of records that is not normalized (for example,
the client hostnames are specified in full each time, even though the same client may appear
many times), and this is one reason that logfiles of all kinds are particularly well suited to
analysis with MapReduce.
MapReduce is a linearly scalable programming model. The programmer writes two functions,
a map function and a reduce function, each of which defines a mapping from one set of
key-value pairs to another. These functions are oblivious to the size of the data or the cluster
that they are operating on, so they can be used unchanged for a small dataset and for a massive
one. More importantly, if you double the size of the input data, a job will run twice as slow.
But if you also double the size of the cluster, a job will run as fast as the original one. This is
not generally true of SQL queries.
Over time, however, the differences between relational databases and MapReduce systems
are likely to blur, both as relational databases start incorporating some of the ideas
from MapReduce (such as Aster Data's and Greenplum's databases), and, from the other
direction, as higher-level query languages built on MapReduce (such as Pig and Hive) make
MapReduce systems more approachable to traditional database programmers.
Grid Computing
The High Performance Computing (HPC) and Grid Computing communities have been
doing large-scale data processing for years, using such APIs as Message Passing Interface
(MPI). Broadly, the approach in HPC is to distribute the work across a cluster of machines,
which access a shared filesystem, hosted by a SAN. This works well for predominantly
compute-intensive jobs, but becomes a problem when nodes need to access larger data
volumes (hundreds of gigabytes, the point at which MapReduce really starts to shine), since
the network bandwidth is the bottleneck, and compute nodes become idle.
MapReduce tries to colocate the data with the compute node, so data access is fast since it
is local. This feature, known as data locality, is at the heart of MapReduce and is the reason
for its good performance. Recognizing that network bandwidth is the most precious resource
in a data center environment (it is easy to saturate network links by copying data around),
MapReduce implementations go to great lengths to preserve it by explicitly modelling
network topology. Notice that this arrangement does not preclude high-CPU analyses in
MapReduce.
MPI gives great control to the programmer, but requires that he or she explicitly handle the
mechanics of the data flow, exposed via low-level C routines and constructs, such as sockets,
as well as the higher-level algorithm for the analysis. MapReduce operates only at the higher
level: the programmer thinks in terms of functions of key and value pairs, and the data flow is
implicit.
Coordinating the processes in a large-scale distributed computation is a challenge. The
hardest aspect is gracefully handling partial failure (when you don't know if a remote process
has failed or not) and still making progress with the overall computation. MapReduce
spares the programmer from having to think about failure, since the implementation
detects failed map or reduce tasks and reschedules replacements on machines that are
healthy. MapReduce is able to do this since it is a shared-nothing architecture, meaning that
tasks have no dependence on one another. (This is a slight oversimplification, since the output
from mappers is fed to the reducers, but this is under the control of the MapReduce system;
in this case, it needs to take more care rerunning a failed reducer than rerunning a failed
map, since it has to make sure it can retrieve the necessary map outputs, and if not,
regenerate them by running the relevant maps again.) So from the programmer's point of
view, the order in which the tasks run doesn't matter. By contrast, MPI programs have to
explicitly manage their own checkpointing and recovery, which gives more control to the
programmer, but makes them more difficult to write.
MapReduce might sound like quite a restrictive programming model, and in a sense it is: you
are limited to key and value types that are related in specified ways, and mappers and
reducers run with very limited coordination between one another (the mappers pass keys and
values to reducers). A natural question to ask is: can you do anything useful or nontrivial with
it?
The answer is yes. MapReduce was invented by engineers at Google as a system for
building production search indexes because they found themselves solving the same
problem over and over again (and MapReduce was inspired by older ideas from the
functional programming, distributed computing, and database communities), but it has since
been used for many other applications in many other industries. It is pleasantly surprising to
see the range of algorithms that can be expressed in MapReduce, from image analysis, to
graph-based problems, to machine learning algorithms. It can't solve every problem, of
course, but it is a general data-processing tool.
Volunteer Computing
When people first hear about Hadoop and MapReduce, they often ask, "How is it different
from SETI@home?" SETI, the Search for Extra-Terrestrial Intelligence, runs a project called
SETI@home in which volunteers donate CPU time from their otherwise idle computers to
analyze radio telescope data for signs of intelligent life outside earth. SETI@home is the
most well-known of many volunteer computing projects; others include the Great Internet
Mersenne Prime Search (to search for large prime numbers) and Folding@home (to
understand protein folding, and how it relates to disease).
Volunteer computing projects work by breaking the problem they are trying to solve into
chunks called work units, which are sent to computers around the world to be analyzed. For
example, a SETI@home work unit is about 0.35 MB of radio telescope data, and takes hours
or days to analyze on a typical home computer. When the analysis is completed, the results
are sent back to the server, and the client gets another work unit. As a precaution to combat
cheating, each work unit is sent to three different machines, and needs at least two results to
agree to be accepted.
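The acceptance rule described above (three copies sent out, at least two matching results required) can be sketched as a simple quorum check. This is an illustration only, not SETI@home's actual validation code: the function name accept_work_unit is hypothetical, and results are assumed to be directly comparable values.

```python
from collections import Counter

def accept_work_unit(results, quorum=2):
    """Return the agreed result if at least `quorum` copies match, else None."""
    if not results:
        return None
    # Find the most frequently reported result and how many machines returned it.
    value, votes = Counter(results).most_common(1)[0]
    return value if votes >= quorum else None

agreed = accept_work_unit(["signal-A", "signal-A", "signal-B"])
rejected = accept_work_unit(["signal-A", "signal-B", "signal-C"])
```

Here `agreed` holds the value that two machines reported, while `rejected` is None because no two results match.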
Although SETI@home may be superficially similar to MapReduce (breaking a problem into
independent pieces to be worked on in parallel), there are some significant differences. The
SETI@home problem is very CPU-intensive, which makes it suitable for running on hundreds
of thousands of computers across the world, since the time to transfer the work unit is
dwarfed by the time to run the computation on it. Volunteers are donating CPU cycles,
not bandwidth.
MapReduce is designed to run jobs that last minutes or hours on trusted, dedicated
hardware running in a single data center with very high aggregate bandwidth
interconnects. By contrast, SETI@home runs a perpetual computation on untrusted
machines on the Internet with highly variable connection speeds and no data locality.
A Brief History of Hadoop
Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text
search library. Hadoop has its origins in Apache Nutch, an open source web search engine,
itself a part of the Lucene project.
The Origin of the Name "Hadoop"
The name Hadoop is not an acronym; it's a made-up name. The project's creator, Doug
Cutting, explains how the name came about:
The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and
pronounce, meaningless, and not used elsewhere: those are my naming criteria. Kids are good
at generating such. Googol is a kid's term.
Subprojects and "contrib" modules in Hadoop also tend to have names that are unrelated to
their function, often with an elephant or other animal theme ("Pig," for example). Smaller
components are given more descriptive (and therefore more mundane) names. This is a good
principle, as it means you can generally work out what something does from its name. For
example, the jobtracker keeps track of MapReduce jobs.
Building a web search engine from scratch was an ambitious goal, for not only is the software
required to crawl and index websites complex to write, but it is also a challenge to run
without a dedicated operations team, since there are so many moving parts. It's expensive
too: Mike Cafarella and Doug Cutting estimated a system supporting a 1-billion-page index
would cost around half a million dollars in hardware, with a monthly running cost of
$30,000. Nevertheless, they believed it was a worthy goal, as it would open up and
ultimately democratize search engine algorithms.
Nutch was started in 2002, and a working crawler and search system quickly emerged.
However, they realized that their architecture wouldn't scale to the billions of pages on the
Web. Help was at hand with the publication of a paper in 2003 that described the
architecture of Google's distributed filesystem, called GFS, which was being used in production
at Google. GFS, or something like it, would solve their storage needs for the very large files
generated as a part of the web crawl and indexing process. In particular, GFS would free up
time being spent on administrative tasks such as managing storage nodes. In 2004, they set
about writing an open source implementation, the Nutch Distributed Filesystem (NDFS).
In 2004, Google published the paper that introduced MapReduce to the world. Early in 2005,
the Nutch developers had a working MapReduce implementation in Nutch, and by the middle
of that year all the major Nutch algorithms had been ported to run using MapReduce and
NDFS.
NDFS and the MapReduce implementation in Nutch were applicable beyond the realm of
search, and in February 2006 they moved out of Nutch to form an independent subproject
of Lucene called Hadoop. At around the same time, Doug Cutting joined Yahoo!, which
provided a dedicated team and the resources to turn Hadoop into a system that ran at web
scale (see sidebar). This was demonstrated in February 2008 when Yahoo! announced
that its production search index was being generated by a 10,000-core Hadoop cluster.
In January 2008, Hadoop was made its own top-level project at Apache, confirming its success
and its diverse, active community. By this time Hadoop was being used by many other
companies besides Yahoo!, such as Last.fm and Facebook.
In one well-publicized feat, the New York Times used Amazon's EC2 compute cloud to crunch
through four terabytes of scanned archives from the paper, converting them to PDFs for the
Web. The processing took less than 24 hours to run using 100 machines, and the project
probably wouldn't have been embarked on without the combination of Amazon's pay-by-the-hour
model (which allowed the NYT to access a large number of machines for a short period),
and Hadoop's easy-to-use parallel programming model.
In April 2008, Hadoop broke a world record to become the fastest system to sort a terabyte
of data. Running on a 910-node cluster, Hadoop sorted one terabyte in 209 seconds (just
under 3½ minutes), beating the previous year's winner of 297 seconds (described in detail in
"TeraByte Sort on Apache Hadoop"). In November of the same year, Google
reported that its MapReduce implementation sorted one terabyte in 68 seconds.
Hadoop at Yahoo!
Building Internet-scale search engines requires huge amounts of data and therefore large
numbers of machines to process it. Yahoo! Search consists of four primary components:
the Crawler, which downloads pages from web servers; the WebMap, which builds a graph of
the known Web; the Indexer, which builds a reverse index to the best pages; and the
Runtime, which answers users' queries. The WebMap is a graph that consists of roughly 1
trillion (10^12) edges each representing a web link and 100 billion (10^11) nodes each
representing distinct URLs. Creating and analyzing such a large graph requires a large number
of computers running for many days.
In early 2005, the infrastructure for the WebMap, named Dreadnaught, needed to be
redesigned to scale up to more nodes. Dreadnaught had successfully scaled from 20 to 600
nodes, but required a complete redesign to scale up further. Dreadnaught is similar to
MapReduce in many ways, but provides more flexibility and less structure. In particular,
each fragment in a Dreadnaught job can send output to each of the fragments in the next
stage of the job, but the sort was all done in library code. In practice, most of the WebMap
phases were pairs that corresponded to MapReduce. Therefore, the WebMap applications
would not require extensive refactoring to fit into MapReduce.
Eric Baldeschwieler (Eric14) created a small team and we started designing and
prototyping a new framework written in C++ modeled after GFS and MapReduce to replace
Dreadnaught. Although the immediate need was for a new framework for WebMap, it was
clear that standardization of the batch platform across Yahoo! Search was critical and by
making the framework general enough to support other users, we could better leverage
investment in the new platform.
At the same time, we were watching Hadoop, which was part of Nutch, and its progress. In
January 2006, Yahoo! hired Doug Cutting, and a month later we decided to abandon our
prototype and adopt Hadoop. The advantage of Hadoop over our prototype and design was
that it was already working with a real application (Nutch) on 20 nodes. That allowed us to
bring up a research cluster two months later and start helping real customers use the
new framework much sooner than we could have otherwise. Another advantage, of course,
was that since Hadoop was already open source, it was easier (although far from easy!) to
get permission from Yahoo!'s legal department to work in open source. So we set up a
200-node cluster for the researchers in early 2006 and put the WebMap conversion plans
on hold while we supported and improved Hadoop for the research users.
The Apache Hadoop Project
Today, Hadoop is a collection of related subprojects that fall under the umbrella of
infrastructure for distributed computing. These projects are hosted by the Apache Software
Foundation, which provides support for a community of open source software projects.
Although Hadoop is best known for MapReduce and its distributed filesystem (HDFS,
renamed from NDFS), the other subprojects provide complementary services, or build on
the core to add higher-level abstractions. The subprojects, and where they sit in the
technology stack, are shown in Figure 1-1 and described briefly here:
Core
A set of components and interfaces for distributed filesystems and general I/O (serialization,
Java RPC, persistent data structures).
Avro
A data serialization system for efficient, cross-language RPC, and persistent data storage. (At
the time of this writing, Avro had been created only as a new subproject, and no other Hadoop
subprojects were using it yet.)
MapReduce
A distributed data processing model and execution environment that runs on large clusters
of commodity machines.
HDFS
A distributed filesystem that runs on large clusters of commodity machines.
Pig
A data flow language and execution environment for exploring very large datasets.
Pig runs on HDFS and MapReduce clusters.
HBase
A distributed, column-oriented database. HBase uses HDFS for its underlying storage,
and supports both batch-style computations using MapReduce and point queries
(random reads).
ZooKeeper
A distributed, highly available coordination service. ZooKeeper provides primitives
such as distributed locks that can be used for building distributed applications.
Hive
A distributed data warehouse. Hive manages data stored in HDFS and provides a query
language based on SQL (and which is translated by the runtime engine to MapReduce
jobs) for querying the data.
Chukwa
HDFS
When a dataset outgrows the storage capacity of a single physical machine, it becomes
necessary to partition it across a number of separate machines. Filesystems that manage
the storage across a network of machines are called distributed filesystems. Since they are
network-based, all the complications of network programming kick in, thus making
distributed filesystems more complex than regular disk filesystems. For example, one of
the biggest challenges is making the filesystem tolerate node failure without suffering data
loss.
Hadoop comes with a distributed filesystem called HDFS, which stands for Hadoop Distributed
Filesystem. (You may sometimes see references to "DFS", informally or in older
documentation or configuration, which is the same thing.) HDFS is Hadoop's flagship
filesystem and is the focus of this chapter, but Hadoop actually has a general-purpose
filesystem abstraction, so we'll see along the way how Hadoop integrates with other storage
systems (such as the local filesystem and Amazon S3).
The Design of HDFS
HDFS is a filesystem designed for storing very large files with streaming data access patterns,
running on clusters of commodity hardware. Let's examine this statement in more detail:
Very large files
"Very large" in this context means files that are hundreds of megabytes, gigabytes, or
terabytes in size. There are Hadoop clusters running today that store petabytes of data.
Streaming data access
HDFS is built around the idea that the most efficient data processing pattern is a
write-once, read-many-times pattern. A dataset is typically generated or copied from source, then
various analyses are performed on that dataset over time. Each analysis will involve a large
proportion, if not all, of the dataset, so the time to read the whole dataset is more
important than the latency in reading the first record.
Commodity hardware
Hadoop doesn't require expensive, highly reliable hardware to run on. It's designed to run on
clusters of commodity hardware (commonly available hardware available from multiple
vendors) for which the chance of node failure across the cluster is high, at least for large
clusters. HDFS is designed to carry on working without a noticeable interruption to the user
in the face of such failure.
It is also worth examining the applications for which using HDFS does not work so well.
While this may change in the future, these are areas where HDFS is not a good fit today:
Low-latency data access
Applications that require low-latency access to data, in the tens of milliseconds range, will not
work well with HDFS. Remember HDFS is optimized for delivering a high throughput of data,
and this may be at the expense of latency. HBase is currently a better choice for low-latency
access.
Lots of small files
Since the namenode holds filesystem metadata in memory, the limit to the number of
files in a filesystem is governed by the amount of memory on the namenode. As a rule of
thumb, each file, directory, and block takes
about 150 bytes. So, for example, if you had one million files, each taking one block, you
would need at least 300 MB of memory. While storing millions of files is feasible, billions is
beyond the capability of current hardware.
Multiple writers, arbitrary file modifications
Files in HDFS may be written to by a single writer. Writes are always made at the end
of the file. There is no support for multiple writers, or for modifications at arbitrary
offsets in the file. (These might be supported in the future, but they are likely to be
relatively inefficient.)
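The rule of thumb given under "Lots of small files" can be turned into a quick estimate. The sketch below assumes the figure of about 150 bytes per file, directory, and block quoted there; the function name namenode_memory_bytes and its parameters are hypothetical.

```python
BYTES_PER_OBJECT = 150  # rule-of-thumb cost of each file, directory, or block

def namenode_memory_bytes(num_files, blocks_per_file=1, num_dirs=0):
    # Every file, every block, and every directory is one in-memory object.
    objects = num_files + num_files * blocks_per_file + num_dirs
    return objects * BYTES_PER_OBJECT

estimate = namenode_memory_bytes(1_000_000)
estimate_mb = estimate / 10**6
```

One million single-block files give two million objects, which is where the figure of at least 300 MB comes from.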
HDFS Concepts
Blocks
A disk has a block size, which is the minimum amount of data that it can read or
write. Filesystems for a single disk build on this by dealing with data in blocks, which
are an integral multiple of the disk block size. Filesystem blocks are typically a few
kilobytes in size, while disk blocks are normally 512 bytes. This is generally
transparent to the filesystem user who is simply reading or writing a file, of
whatever length. However, there are tools to do with filesystem maintenance, such
as df and fsck, that operate on the filesystem block level.
HDFS too has the concept of a block, but it is a much larger unit: 64 MB by default.
Like in a filesystem for a single disk, files in HDFS are broken into block-sized chunks,
which are stored as independent units. Unlike a filesystem for a single disk, a file in
HDFS that is smaller than a single block does not occupy a full block's worth of
underlying storage. When unqualified, the term "block" in this book refers to a block in
HDFS.
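How a file breaks into block-sized chunks can be sketched with a little arithmetic. This assumes the 64 MB default mentioned above; the function name block_lengths is hypothetical and the code does not touch HDFS itself.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB default block size

def block_lengths(file_size, block_size=BLOCK_SIZE):
    """Lengths in bytes of the blocks a file of `file_size` bytes splits into."""
    full, remainder = divmod(file_size, block_size)
    # The final block holds only the leftover bytes, not a full block's worth.
    return [block_size] * full + ([remainder] if remainder else [])

MB = 1024 * 1024
large = block_lengths(150 * MB)
small = block_lengths(1 * MB)
```

A 150 MB file becomes two full blocks plus a 22 MB block, and a 1 MB file occupies a single 1 MB block rather than a full 64 MB.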
Why Is a Block in HDFS So Large?
HDFS blocks are large compared to disk blocks, and the reason is to minimize the cost
of seeks. By making a block large enough, the time to transfer the data from the disk
can be made to be significantly larger than the time to seek to the start of the block.
Thus the time to transfer a large file made of multiple blocks operates at the disk
transfer rate.
A quick calculation shows that if the seek time is around 10 ms, and the transfer rate
is 100 MB/s, then to make the seek time 1% of the transfer time, we need to make the
block size around 100 MB. The default is actually 64 MB, although many HDFS
installations use 128 MB blocks. This figure will continue to be revised upward as
transfer speeds grow with new generations of disk drives.
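The quick calculation above can be written out explicitly. The sketch uses the same assumed figures (10 ms seek, 100 MB/s transfer, seek time equal to 1% of transfer time); the function names are hypothetical.

```python
def block_size_mb(seek_ms, transfer_mb_per_s, seek_percent):
    """Block size at which one seek costs `seek_percent` of the transfer time."""
    transfer_ms = seek_ms * 100 / seek_percent
    return transfer_mb_per_s * transfer_ms / 1000

def seek_overhead_percent(block_mb, seek_ms, transfer_mb_per_s):
    """Seek time as a percentage of the time to transfer one block."""
    transfer_ms = block_mb * 1000 / transfer_mb_per_s
    return 100 * seek_ms / transfer_ms

target = block_size_mb(seek_ms=10, transfer_mb_per_s=100, seek_percent=1)
overhead_64 = seek_overhead_percent(64, seek_ms=10, transfer_mb_per_s=100)
overhead_128 = seek_overhead_percent(128, seek_ms=10, transfer_mb_per_s=100)
```

With these inputs the target block size is 100 MB, and the seek overhead is roughly 1.6% for 64 MB blocks and roughly 0.8% for 128 MB blocks.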
This argument shouldn't be taken too far, however. Map tasks in MapReduce normally
operate on one block at a time, so if you have too few tasks (fewer than nodes in the
cluster), your jobs will run slower than they could otherwise.
Having a block abstraction for a distributed filesystem brings several benefits. The
first benefit is the most obvious: a file can be larger than any single disk in the
network. There's nothing that requires the blocks from a file to be stored on the same
disk, so they can take advantage of any of the disks in the cluster. In fact, it would be
possible, if unusual, to store a single file on an HDFS cluster whose blocks filled all the
disks in the cluster.
Second, making the unit of abstraction a block rather than a file simplifies the storage
subsystem. Simplicity is something to strive for in all systems, but is especially important
for a distributed system in which the failure modes are so varied. The storage
subsystem deals with blocks, simplifying storage management (since blocks are a
fixed size, it is easy to calculate how many can be stored on a given disk), and
eliminating metadata concerns (blocks are just a chunk of data to be stored; file metadata
such as permissions information does not need to be stored with the blocks, so
another system can handle metadata orthogonally).
Furthermore, blocks fit well with replication for providing fault tolerance and
availability. To insure against corrupted blocks and disk and machine failure, each
block is replicated to a small number of physically separate machines (typically
three). If a block becomes unavailable, a copy can be read from another location
in a way that is transparent to the client. A block that is no longer available due to
corruption or machine failure can be replicated from its alternative locations to
other live machines to bring the replication factor back to the normal level. (See
"Data Integrity" on page 75 for more on guarding against corrupt data.) Similarly,
some applications may choose to set a high replication factor for the blocks in a
popular file to spread the read load on the cluster.
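The idea of bringing the replication factor back to its normal level can be illustrated with a small bookkeeping sketch. This is a toy model, not HDFS code: the function name under_replicated is hypothetical, and block locations are assumed to be held as a mapping from block ID to the set of machines storing a copy.

```python
def under_replicated(block_locations, live_nodes, target=3):
    """Map each block with too few live copies to the number of copies missing."""
    missing = {}
    for block, nodes in block_locations.items():
        live_copies = nodes & live_nodes  # copies on machines that are still up
        if len(live_copies) < target:
            missing[block] = target - len(live_copies)
    return missing

locations = {"blk_1": {"n1", "n2", "n3"}, "blk_2": {"n1", "n4", "n5"}}
todo = under_replicated(locations, live_nodes={"n1", "n2", "n3"})
```

In this example machines n4 and n5 have failed, so blk_2 has one live copy left and needs two new replicas, while blk_1 needs none.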
Like its disk filesystem cousin, HDFS's fsck command understands blocks. For example,
running:
% hadoop fsck -files -blocks
will list the blocks that make up each file in the filesystem.
Namenodes and Datanodes
An HDFS cluster has two types of node operating in a master-worker pattern: a
namenode (the master) and a number of datanodes (workers). The namenode manages
the filesystem namespace. It maintains the filesystem tree and the metadata for all
the files and directories in the tree. This information is stored persistently on the
local disk in the form of two files: the namespace image and the edit log. The
namenode also knows the datanodes on which all the blocks for a given file are
located; however, it does not store block locations persistently, since this information is
reconstructed from datanodes when the system starts.
A client accesses the filesystem on behalf of the user by communicating with the
namenode and datanodes. The client presents a POSIX-like filesystem interface, so the
user code does not need to know about the namenode and datanode to function.
Datanodes are the workhorses of the filesystem. They store and retrieve blocks when
they are told to (by clients or the namenode), and they report back to the namenode
periodically with lists of blocks that they are storing.
Without the namenode, the filesystem cannot be used. In fact, if the machine running
the namenode were obliterated, all the files on the filesystem would be lost since
there would be no way of knowing how to reconstruct the files from the blocks on
the datanodes. For this reason, it is important to make the namenode resilient to
failure, and Hadoop provides two mechanisms for this.
The first way is to back up the files that make up the persistent state of the
filesystem metadata. Hadoop can be configured so that the namenode writes its
persistent state to multiple filesystems. These writes are synchronous and atomic. The
usual configuration choice is to write to local disk as well as a remote NFS mount.
It is also possible to run a secondary namenode, which despite its name does not act as a
namenode. Its main role is to periodically merge the namespace image with the edit log to
prevent the edit log from becoming too large. The secondary namenode usually runs on a
separate physical machine, since it requires plenty of CPU and as much memory as the
namenode to perform the merge. It keeps a copy of the merged namespace image, which
can be used in the event of the namenode failing. However, the state of the secondary
namenode lags that of the primary, so in the event of total failure of the primary, data loss is
almost guaranteed. The usual course of action in this case is to copy the namenode's
metadata files that are on NFS to the secondary and run it as the new primary.
The Command-Line Interface
We're going to have a look at HDFS by interacting with it from the command line. There are
many other interfaces to HDFS, but the command line is one of the simplest, and to many
developers the most familiar.
We are going to run HDFS on one machine, so first follow the instructions for setting up
Hadoop in pseudo-distributed mode in Appendix A. Later you'll see how to run on a cluster
of machines to give us scalability and fault tolerance.
There are two properties that we set in the pseudo-distributed configuration that deserve
further explanation. The first is fs.default.name, set to hdfs://localhost/, which is used to set a
default filesystem for Hadoop. Filesystems are specified by a URI, and here we have used an
hdfs URI to configure Hadoop to use HDFS by default. The HDFS daemons will use this
property to determine the host and port for the HDFS namenode. We'll be running it on
localhost, on the default HDFS port, 8020. And HDFS clients will use this property to work out
where the namenode is running so they can connect to it.
We set the second property, dfs.replication, to one so that HDFS doesn't replicate
filesystem blocks by the usual default of three. When running with a single datanode, HDFS
can't replicate blocks to three datanodes, so it would perpetually warn about blocks being
under-replicated. This setting solves that problem.
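How a host and port might be derived from the fs.default.name value can be sketched with Python's standard URL parser. This is only an illustration of the idea, not how Hadoop itself reads the property; the function name namenode_address is hypothetical, and the fallback port is the default of 8020 mentioned above.

```python
from urllib.parse import urlparse

DEFAULT_NAMENODE_PORT = 8020  # default HDFS port quoted in the text

def namenode_address(default_fs):
    """Return (host, port) for an hdfs URI, using the default port if none is given."""
    uri = urlparse(default_fs)
    if uri.scheme != "hdfs":
        raise ValueError("not an hdfs URI: " + default_fs)
    return uri.hostname, uri.port or DEFAULT_NAMENODE_PORT

local = namenode_address("hdfs://localhost/")
# The host name and port below are made up for the example.
remote = namenode_address("hdfs://namenode.example.com:9000/")
```

The value hdfs://localhost/ used in the pseudo-distributed setup carries no port, so the sketch falls back to 8020.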
Basic Files(stem $perations
The filesystem is ready to be used, and we can do all of the usual filesystem operations such
as reading files, creating directories, moving files, deleting data, and listing directories. You
can type hadoop fs -help to get detailed help on every command. Start by copying a file from
the local filesystem to HDFS:
% hadoop fs -copyFromLocal input/docs/quangle.txt hdfs://localhost/user/tom/quangle.txt
This command invokes Hadoop's filesystem shell command fs, which supports a number of
subcommands; in this case, we are running -copyFromLocal. The local file quangle.txt is
copied to the file /user/tom/quangle.txt on the HDFS instance running on localhost. In fact,
we could have omitted the scheme and host of the URI and picked up the default,
hdfs://localhost, as specified in core-site.xml.
% hadoop fs -copyFromLocal input/docs/quangle.txt /user/tom/quangle.txt
We could also have used a relative path, and copied the file to our home directory in HDFS,
which in this case is /user/tom:
% hadoop fs -copyFromLocal input/docs/quangle.txt quangle.txt
Let's copy the file back to the local filesystem and check whether it's the same:
% hadoop fs -copyToLocal quangle.txt quangle.copy.txt
% md5 input/docs/quangle.txt quangle.copy.txt
MD5 (input/docs/quangle.txt) = a16f231da6b05e2ba7a339320e7dacd9
MD5 (quangle.copy.txt) = a16f231da6b05e2ba7a339320e7dacd9
The MD5 digests are the same, showing that the file survived its trip to HDFS and is back
intact. Finally, let's look at an HDFS file listing. We create a directory first just to see how it is
displayed in the listing:
% hadoop fs -mkdir books
% hadoop fs -ls .
Found 2 items
drwxr-xr-x   - tom supergroup          0 2009-04-02 22:41 /user/tom/books
-rw-r--r--   1 tom supergroup        118 2009-04-02 22:29 /user/tom/quangle.txt
The information returned is very similar to the Unix command ls -l, with a few minor
differences. The first column shows the file mode. The second column is the replication factor
of the file (something a traditional Unix filesystem does not have). Remember we set the
default replication factor in the site-wide configuration to be 1, which is why we see the same
value here. The entry in this column is empty for directories since the concept of replication
does not apply to them: directories are treated as metadata and stored by the namenode,
not the datanodes. The third and fourth columns show the file owner and group. The fifth
column is the size of the file in bytes, or zero for directories. The sixth and seventh columns
are the last modified date and time. Finally, the eighth column is the absolute name of the
file or directory.
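The eight columns can be pulled apart with a short Python sketch. It is an illustration only: the names Entry and parse_ls_line are made up for this example, the two sample lines are the listing lines from above, and the code assumes that a directory shows a "-" placeholder where a file shows its replication factor.

```python
from collections import namedtuple

# Field names follow the eight columns described above.
Entry = namedtuple(
    "Entry", "mode replication owner group size date time path"
)


def parse_ls_line(line):
    """Split one listing line into its eight columns."""
    # maxsplit=7 keeps a path containing spaces in one piece.
    mode, repl, owner, group, size, date, time, path = line.split(None, 7)
    # Assumption: directories show "-" instead of a replication factor.
    replication = None if repl == "-" else int(repl)
    return Entry(mode, replication, owner, group, int(size), date, time, path)


sample = [
    "drwxr-xr-x   - tom supergroup          0 2009-04-02 22:41 /user/tom/books",
    "-rw-r--r--   1 tom supergroup        118 2009-04-02 22:29 /user/tom/quangle.txt",
]
for entry in map(parse_ls_line, sample):
    kind = "dir " if entry.mode.startswith("d") else "file"
    print(kind, entry.path, entry.size, entry.replication)
```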
File Permissions in HDFS
HDFS has a permissions model for files and directories that is much like POSIX. There are
three types of permission: the read permission (r), the write permission (w) and the execute
permission (x). The read permission is required to read files or list the contents of a directory.
The write permission is required to write a file, or for a directory, to create or delete files or
directories in it.
The execute permission is ignored for a file since you can't execute a file on HDFS (unlike
POSIX), and for a directory it is required to access its children.
Each file and directory has an owner, a group, and a mode. The mode is made up of the
permissions for the user who is the owner, the permissions for the users who are members of
the group, and the permissions for users who are neither the owner nor members of the
group.
A client's identity is determined by the username and groups of the process it is running
in. Because clients are remote, this makes it possible to become an arbitrary user, simply
by creating an account of that name on the remote system.
Thus, permissions should be used only in a cooperative community of users, as a mechanism
for sharing filesystem resources and for avoiding accidental data loss, and not for securing
resources in a hostile environment. However, despite these drawbacks, it is worthwhile
having permissions enabled (as it is by default; see the dfs.permissions property), to avoid
accidental modification or deletion of substantial parts of the filesystem, either by users or
by automated tools or programs.
When permissions checking is enabled, the owner permissions are checked if the client's
username matches the owner, and the group permissions are checked if the client is a
member of the group; otherwise, the other permissions are checked.
There is a concept of a superuser, which is the identity of the namenode process.
Permissions checks are not performed for the superuser.
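The checking order described above (superuser first, then owner, then group, then other) can be sketched in Python as follows. This is a model for study purposes, not the namenode's implementation; the function name is_permitted, the octal mode encoding, and the superuser name "hdfs" are assumptions chosen for the example.

```python
# Permission bits within one rwx triple, as in POSIX octal modes.
READ, WRITE, EXECUTE = 4, 2, 1


def is_permitted(user, groups, owner, group, mode, wanted, superuser="hdfs"):
    """Decide whether `user` may perform the `wanted` access on an entry.

    mode is an octal value such as 0o644; superuser is a hypothetical
    name standing in for the identity of the namenode process.
    """
    if user == superuser:
        # No permission checks are performed for the superuser.
        return True
    if user == owner:
        bits = (mode >> 6) & 7      # owner permissions
    elif group in groups:
        bits = (mode >> 3) & 7      # group permissions
    else:
        bits = mode & 7             # other permissions
    return (bits & wanted) == wanted


# quangle.txt from the listing: -rw-r--r-- tom supergroup
print(is_permitted("tom", {"supergroup"}, "tom", "supergroup", 0o644, WRITE))
print(is_permitted("alice", {"supergroup"}, "tom", "supergroup", 0o644, WRITE))
```

Note that exactly one of the three permission sets is consulted, so an owner whose own bits deny access is refused even if the group bits would allow it.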
Hadoop Filesystems
Hadoop has an abstract notion of filesystem, of which HDFS is just one implementation.
The Java abstract class org.apache.hadoop.fs.FileSystem represents a filesystem in
Hadoop, and there are several concrete implementations, which are described below:
Filesystem | URI scheme | Java implementation (all under org.apache.hadoop) | Description
Local | file | fs.LocalFileSystem | A filesystem for a locally connected disk with client-side checksums. Use RawLocalFileSystem for a local filesystem with no checksums. See "LocalFileSystem" on page 76.
HDFS | hdfs | hdfs.DistributedFileSystem | Hadoop's distributed filesystem. HDFS is designed to work efficiently in conjunction with MapReduce.
HFTP | hftp | hdfs.HftpFileSystem | A filesystem providing read-only access to HDFS over HTTP. (Despite its name, HFTP has no connection with FTP.) Often used with distcp ("Parallel Copying with distcp" on page 70) to copy data between HDFS clusters running different versions.
HSFTP | hsftp | hdfs.HsftpFileSystem | A filesystem providing read-only access to HDFS over HTTPS. (Again, this has no connection with FTP.)
HAR | har | fs.HarFileSystem | A filesystem layered on another filesystem for archiving files. Hadoop Archives are typically used for archiving files in HDFS to reduce the namenode's memory usage.
KFS (CloudStore) | kfs | fs.kfs.KosmosFileSystem | CloudStore (formerly Kosmos filesystem) is a distributed filesystem like HDFS or Google's GFS, written in C++. Find more information about it at http://kosmosfs.sourceforge.net/.
FTP | ftp | fs.ftp.FTPFileSystem | A filesystem backed by an FTP server.
S3 (native) | s3n | fs.s3native.NativeS3FileSystem | A filesystem backed by Amazon S3. See http://wiki.apache.org/hadoop/AmazonS3.
S3 (block-based) | s3 | fs.s3.S3FileSystem | A filesystem backed by Amazon S3, which stores files in blocks.
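As a quick way to remember the mapping from URI scheme to implementation class, the table can be written as a Python lookup. This is only a study aid: Hadoop itself chooses the class through its configuration, and the names IMPLEMENTATIONS and implementation_for are invented for this sketch.

```python
from urllib.parse import urlparse

PACKAGE_PREFIX = "org.apache.hadoop."

# URI scheme -> class name relative to org.apache.hadoop (from the table).
IMPLEMENTATIONS = {
    "file": "fs.LocalFileSystem",
    "hdfs": "hdfs.DistributedFileSystem",
    "hftp": "hdfs.HftpFileSystem",
    "hsftp": "hdfs.HsftpFileSystem",
    "har": "fs.HarFileSystem",
    "kfs": "fs.kfs.KosmosFileSystem",
    "ftp": "fs.ftp.FTPFileSystem",
    "s3n": "fs.s3native.NativeS3FileSystem",
    "s3": "fs.s3.S3FileSystem",
}


def implementation_for(uri, default_scheme="hdfs"):
    """Return the fully qualified class name that would serve this URI."""
    # A path without a scheme falls back to the assumed default filesystem.
    scheme = urlparse(uri).scheme or default_scheme
    return PACKAGE_PREFIX + IMPLEMENTATIONS[scheme]


print(implementation_for("hdfs://localhost/user/tom/quangle.txt"))
print(implementation_for("s3n://bucket/key"))
```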
Note: Don't go too much into the technical aspects of Hadoop & HDFS. You need to
understand the concept and derive from that an understanding of unlocking old
data. Fundamentally, these technologies allow for mining data that otherwise could
not be mined, simply by supporting distributed computing and storage. Also, use
the slides to figure out how a MapReduce program works and the various entities
involved.