Lecture 2

Distributed Computing

Distribution, Part II: The History of Distributed Computing
In the beginning...

The first thing you probably think of is
Mainframe Computing
– That’s distributed right?
– The computer’s over there, my terminal is over
here…
– There are many terminals, gotta be distributing
something right?

But this isn’t distributed computing, as all the
compute is in one place.
In the beginning….

Distribution first arose when you could have
multiple computers within a single organisation.

Problem is one of resource sharing (on
ARPANET circa 1976 no less).

Actually predates the TCP/IP stack.
– Used NCP, the Network Control Program.

Most RPC stacks were hack jobs for single
purpose systems.
Scalability? Who needs that?
Xerox PARC

Special projects research lab owned by Xerox (you’ll
likely know them for their printers)

Invented a Xerox specific RPC system for Xerox
machines.

This was based on Xerox’s understanding that the
future of computing would have many computers per
organisation.

They also invented the first GUI.
– A little company called Apple stole it though.
Sun ONC RPC

The year is 1984, and Sun Microsystems has
not invented Java yet.

They do have very cool Unix systems based on
RISC architectures though.

And they have a problem:
– Hey, I need that file from over there, but I don’t want
to pay for it to be shipped in floppy disk format to
me. Surely I can use the company network to get it,
right?
Sun ONC RPC

Sadly no, you could not, as remote file mounts had
not been invented.
– Actually remote anything was a bit of a far-fetched idea

So naturally, Sun invented the Network File System
(NFS)
– The descendant of this system runs the home directory
shares in the labs!

This included a means of remotely using file
systems (via RPC) called Open Network Computing
Remote Procedure Call.
Sun ONC RPC

ONC RPC was wildly popular, as it was both
open source (BSD) and was generic and well
structured.

Problem was this only defined an RPC protocol,
not a library that actually did it.

So unless you were using C, you were going to
have a bad time.
DCE/RPC

In the early 1990s, IBM got around to doing
RPC properly
– Of course they couldn’t do it themselves, so they
got HP, DEC, and even Sun together to help out.

The Open Software Foundation defined a new
RPC framework called the Distributed
Computing Environment.
DCE/RPC

DCE was super cool:
– Included a common way of doing authentication
– Had the first built-in time service
– Integrated DNS
– Distributed File System
– And a Remote Procedure Call system

Wait a minute… That reminds me of Windows
Domains….
DCE/RPC

DCE is still just a guideline though
– Didn’t really have anything greater than a C
implementation
– That said this was the days of Unix…

Was hugely popular with larger organisations,
especially now that the IBM PC was gaining
serious traction.
CORBA

Common Object Request Broker Architecture

What if you’re not using C?

What if you don’t like using these big, bloated
frameworks?

What if you just want two dang programs to
communicate on two computers?

You use CORBA, that’s what you do!
CORBA

Directly competed with DCE

Didn’t have any of that fancy pants
authentication/time/file system addons

Just let you define interfaces for computer
programs to use.

Actually was an integrated system for doing so
(not just a set of guidelines and some C
integrations)
CORBA

CORBA is built around the idea of Object
Request Brokers (or ORBs)

ORBs are a middleware service that allows programs
written in different languages to communicate over the network

ORBs are designed to be cross compatible,
regardless of architecture or underlying
language.

ORBs are represented as objects, which allows
the system to hide nasty code inside classes.
CORBA

Objects that define interfaces to the internet?
That sounds like a Component!
– And CORBA agreed… eventually
– Added support for all the bloat-features of DCE, but
they were optional.

CORBA had ORBs for each OO language
– C++, Java, etc
– Your connection objects simply inherited from
whichever ORB class was present.
CORBA

Why was this all so cool?
– There was still no standardised format for passing
data around the internet.
– CORBA provided one that was language and
system independent.
– It also provided a language independent means of
writing the interfaces, meaning clients and servers
were implementation independent!
– CORBA is still around today (although not popular)
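To see why a shared wire format matters, here is a minimal Java sketch of marshalling the calculator's add request into a fixed byte layout that any language could read back. The layout (a length-prefixed operation name followed by two big-endian 32-bit integers) and the names AddRequest, marshal and unmarshal are assumptions made up for illustration; this is not CORBA's actual encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical request type: an operation name plus two integer operands.
class AddRequest {
    final String operation;
    final int a;
    final int b;

    AddRequest(String operation, int a, int b) {
        this.operation = operation;
        this.a = a;
        this.b = b;
    }

    // Writes the request in a fixed layout: 2-byte length, the name in
    // (modified) UTF-8, then two big-endian 32-bit integers.
    byte[] marshal() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(operation);
            out.writeInt(a);
            out.writeInt(b);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the fields back in exactly the order they were written.
    static AddRequest unmarshal(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            String operation = in.readUTF();
            int a = in.readInt();
            int b = in.readInt();
            return new AddRequest(operation, a, b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because both sides agree on byte order and field order, the receiver does not need to know what language or CPU produced the bytes.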
CORBA IDL
CORBA IDL Process
CORBA Today

Good idea, but there were problems
– Spec was hugely complicated because the ORBs
were written by different vendors.

Who charged a lot

ORBs turned out to not be as interoperable as promised
– Competing, less expensive frameworks killed off the
project

Java RMI was free and did the same thing by 1999

Also, Microsoft
We’ve forgotten someone important
Microsoft is Distributed Computing

Microsoft has dominated distributed computing
since the mid-90s.

This is because Microsoft has based their entire
OS line around the idea of many computers in
enormous distributed systems since Windows for
Workgroups 3.11.

This idea has been the key to Microsoft’s
success throughout the years.
DLLs

The DLL is the fundamental building block of
modern Windows systems

Very similar to Unix Shared Object libraries
– They are linked at run time
– Can also be linked at compile time
– Language neutral

However, DLLs support Late Binding.
DLL Late Binding

The “killer feature” of DLLs is that functions can be
bound by name
– At run time, the OS can search the DLL for a specific
function name

This means that applications can check for missing
DLLs and DLL compatibility issues at run time.
This can avoid crashes and allows for dynamic
coding.

However, this is slower and there are no compile
time checks.
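Java has no DLLs, but reflection gives a rough analogue of binding a function by name at run time. The sketch below is only an illustration of the idea: the class LateBinding and its callByName helper are made-up names, and java.lang.Math stands in for the library being searched.

```java
import java.lang.reflect.Method;

class LateBinding {
    // Looks up a public static int f(int, int) by name at run time and calls it.
    // Returns null when the class or function cannot be found or the call fails,
    // so the caller can react instead of crashing.
    static Integer callByName(String className, String functionName, int a, int b) {
        try {
            Class<?> library = Class.forName(className);
            Method function = library.getMethod(functionName, int.class, int.class);
            return (Integer) function.invoke(null, a, b);
        } catch (ReflectiveOperationException missing) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Found by name: prints 5
        System.out.println(callByName("java.lang.Math", "addExact", 2, 3));
        // Not found: prints null rather than failing
        System.out.println(callByName("java.lang.Math", "noSuchFunction", 2, 3));
    }
}
```

Note the trade-off from the slide: a typo in the function name only shows up when the program runs, because the compiler never sees the call.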
DLL Functions in C or C++

Declarations exported from a DLL are prefixed with
__declspec(dllexport)
– This applies to every class and function that clients
need to see

An alternative is a module-definition (.def) file
– This also lets you assign ordinal positions to functions
– But it is rarely used
DLL Definitions in C#

Are just class libraries
– i.e. groups of classes that work together

These have no special rules and can simply be
compiled via Visual Studio.
Calling Functions in DLLs

Using C++/C, compile against header file and .lib
file
– The .lib file contains a stub to perform the DLL lookup

Otherwise you need to use the Windows API
– Example of this on the next slide
– Different languages do this differently
– COM DLLs must be handled differently
– .NET DLLs need the .NET common language runtime
DLL pros

Exe files are smaller as DLLs are incorporated
at run time
– Disk space use is less too as you only need one
DLL for many applications

In-memory DLL code can be shared amongst all
applications that use the DLL

Upgrading a DLL upgrades all client
applications
DLL cons

Versions of the DLLs used by an application
must be compatible with each other and the
application
– Bad upgrades can break every app that uses it

Dependencies are outside of the compiled
application

Security issues exist with “by name” access
– Name clashes?
DLLs today

Very old by component standards
– Have existed since OS/2 times.

More a component container system

DLLs can be normal, COM, or .NET
components.
– Modern .NET systems allow all compiled code to
act as DLLs. Even EXEs!

So you will probably use DLLs in industry.
What is COM?

COM: Component Object Model

Also known by its cool rebranded name
ActiveX

Developed out of Microsoft’s Object Linking and
Embedding architecture (OLE)
– OLE allowed one application to host objects from
another
– This is what lets you embed Excel spreadsheets in
Word.
What is COM?

COM is OLE extended along CORBA lines
– Interfaces defined by Microsoft’s IDL, MIDL
– Interface based RPC (called DCOM)
– Name server (the Windows registry)

Allows for lookup by GUID rather than name

This is hideous, but allows for unique component lookup
by version/system/machine.

Eg: f943b44a-0d95-45e3-90c5-34e841c531b2

Separated into Interface GUIDs (IIDs) and Class GUIDs
(CLSIDs)
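Lookup by GUID can be sketched in a few lines of Java. This is an illustration of the idea only, not the COM API: ComponentRegistry, Adder and SimpleAdder are hypothetical names, an in-memory map stands in for the Windows registry, and a Java interface type stands in for an IID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical component interface (plays the role of an interface ID).
interface Adder {
    int add(int a, int b);
}

// Hypothetical component class (identified by a class ID).
class SimpleAdder implements Adder {
    @Override
    public int add(int a, int b) {
        return a + b;
    }
}

// Stand-in for the name server: maps class GUIDs to ways of creating objects.
class ComponentRegistry {
    private final Map<UUID, Supplier<?>> classes = new HashMap<>();

    void register(UUID clsid, Supplier<?> constructor) {
        classes.put(clsid, constructor);
    }

    // Creates the component registered under clsid and checks that it
    // implements the requested interface (a ClassCastException otherwise).
    <T> T create(UUID clsid, Class<T> requestedInterface) {
        Supplier<?> constructor = classes.get(clsid);
        if (constructor == null) {
            throw new IllegalArgumentException("class not registered: " + clsid);
        }
        return requestedInterface.cast(constructor.get());
    }

    public static void main(String[] args) {
        UUID clsid = UUID.fromString("f943b44a-0d95-45e3-90c5-34e841c531b2");
        ComponentRegistry registry = new ComponentRegistry();
        registry.register(clsid, SimpleAdder::new);
        System.out.println(registry.create(clsid, Adder.class).add(2, 3)); // prints 5
    }
}
```

The client never names SimpleAdder; it only knows the GUID and the interface it expects back.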
COM GUIDs

Interfaces via their IID are unbreakable
contracts
– This guarantees that clients can rely on them
forever.

Problem: Interfaces change all the time
– Every change of any kind needs a new IID.
– This results in huge logistical problems in COM
projects.
DCOM

Distributed computing was added to COM
– COM was just initially for OLE use.

DCOM works much like COM; it just uses DCE/RPC
to perform COM requests over a network
interface.

DCOM completely dominated DCE via
Microsoft’s ever popular “embrace, extend,
extinguish” (EEE) approach.

This is still the underlying system behind all
Windows Networks today.
COM GUIs

Microsoft used COM to allow users to embed
GUI elements into other applications.

This allows for really easy extensibility of
Microsoft programs, without needing to know
how the underlying code works.

This could be generalised to any component in
a container.

This was eventually renamed to ActiveX
ActiveX

ActiveX directly competed with Java applets.

Microsoft allowed ActiveX integration with IE
– This was a terrible, terrible idea.

ActiveX implements a standard component
interface
– IOleObject – defines parameters of GUI controls
– IDispatch – allows functions to be called by name.

This was also a terrible idea.
COM Today

Still the core of Windows networks.

Very outdated, .NET is the king of the Windows
Environment these days.
– However, lots of COM still exists, so .NET and COM
have a very well defined interface

Microsoft continues to push .NET and the general
concept of Web Services out into the world.
– However, Google and Amazon have stolen their ideas and
taken their crown.
Java RMI

In the late 1990s, Java arrived, and brought
with it Sun Microsystems’ RPC knowledge.

Enterprise Java had a thing called RMI.
– Normal Java has it too these days

Remote Method Invocation allows for remote calls
without any tools outside the language.
Java RMI

Like CORBA, uses a defined interface.

Unlike CORBA, this is entirely defined in Java
– Using an…. Interface.
– It needs to extend the java.rmi.Remote interface.
– Then create stub classes from that, and follow
CORBA process from that point.
Java RMI

Like CORBA, inheritance is used to hide the nasty
stuff.
– Server object inherits from UnicastRemoteObject
– Again, no IDL class required.

Java also has a name service for finding components
– Called rmiregistry.
– It’s a command line program.

Problem: RMI has no inbuilt security integration.
Java RMI Today

Java RMI is still used today

It works pretty well, and provides an all-in-one,
no frills approach to component distribution.

The only problem is, it’s Java.
– And therefore kind of stands alone.
.NET

Microsoft very much liked the idea of Java’s VM-based,
universally compatible features.
– Microsoft tried to make a Java implementation in
1996.
– Sun actually sued Microsoft for not following the
spec.

Eventually though, Microsoft decided to build
their own Java-like system.
– This was named .NET, and its native language is C#
The .NET CLR

Works like the Java VM
– Source code is compiled to machine-independent
byte code (the Common Intermediate Language)
– Performs memory management and integrates with the
underlying OS.
– Converts byte code into platform specific
executable code via a JIT (Just in Time) compiler.
– Both allow multiple languages provided they can
convert to the CIL.
CLR CIL

Code that compiles to CIL is called managed
code and is managed by the .NET framework
– Better security cause no pointers
– Platform independence via .NET VM
– However, slower due to JIT compilation

This is very nearly not a problem these days, thanks to
heavily optimised JIT compilers.
CLR Non-CIL

Code not supported by the CIL is called
unmanaged code (also unsafe or native code)
– Less security
– Generally speaking limited in languages (to C++)
– C++ and C# both can allow for managed and
unmanaged code in the same application

Although this is discouraged and will be penalised if you
do it in this unit.

Basically there should be very nearly no reason to do
this.
.NET Remoting

.NET Remoting is a system that essentially
replaces DCOM for .NET

Is, unsurprisingly, very similar to RMI

However, there is no IDL or visible proxy code
– It’s all hidden in the .NET backend.
– Remotely-callable server objects must derive from
MarshalByRefObject.
– The server object’s public methods are the RPC
interface. (very cool)
.NET Remoting

The client must reference the server assembly
(EXE/DLL)
– The client needs access to the metadata of the
object (kind of like IDL).
– .NET does this by referencing the server object.

This is kind of like including a header file, but with a lot of
background magic
– This can be avoided with class factories.
.NET Remoting Today

Mostly a legacy system, as Microsoft has a
newer Web Services compatible .NET RPC
framework called WCF.

Remoting is still relevant because:
– Remoting does not require a web server
– Remoting supports binary message formats (which
are generally more compact than XML/JSON)

WCF combines Remoting with Web Services
– And a healthy dose of automagic coding.
.NET WCF

The Windows Communication Foundation
(WCF) is an extension of .NET Remoting.

More like RMI as it uses an interface class.

MarshalByRefObject now replaced by
[ServiceContract] and [OperationContract]
attributes.

Tons more automatic code generation.

Still pretty much the same as older RPC
frameworks.
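The attribute idea can be sketched in Java, where annotations are a rough analogue of .NET attributes. Everything below is an assumption for illustration: the @ServiceContract and @OperationContract annotations are declared here as plain Java annotations that borrow the WCF names, and ContractInspector is a made-up helper showing how a framework could discover which methods are exposed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical Java annotations named after the WCF attributes.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ServiceContract {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface OperationContract {}

@ServiceContract
interface CalculatorContract {
    @OperationContract
    int add(int a, int b);

    // Not marked, so a framework would not expose it remotely.
    int localOnlyHelper(int a);
}

class ContractInspector {
    // Returns the names of the methods marked as remote operations, sorted.
    static List<String> exposedOperations(Class<?> contract) {
        if (!contract.isAnnotationPresent(ServiceContract.class)) {
            throw new IllegalArgumentException(contract.getName() + " is not a service contract");
        }
        List<String> names = new ArrayList<>();
        for (Method method : contract.getMethods()) {
            if (method.isAnnotationPresent(OperationContract.class)) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(exposedOperations(CalculatorContract.class)); // prints [add]
    }
}
```

The point is that the contract is declared with metadata rather than by inheriting from a special base class.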
Examples!

For completeness’ sake, let’s look at some
examples.

These could be useful in a tutorial or
something….
What are we building?

A Calculator!
– More specifically, a calculator add function.

Why on earth are we distributing this?
– This may be dumb, but makes the code simple and
lets us focus on the similarities and differences
– Also gives you an idea how easy it is.

Examples of code are very useful as you
progress through industry! Keep these
somewhere!
Some Generic IDL
C++ Server DLL
C++ Client
COM Component

We’re not going to include COM.

COM is for all practical purposes deprecated
– Has been since before Windows XP.
– It’s very ugly in implementation
– We’ll be using .NET exclusively…. Soooo….

Moving on.
CORBA – Java (Server)
CORBA – Java (Client)
Java RMI Interface
Java RMI Server
Java RMI Client
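The code from these three slides is not preserved here, so what follows is a minimal sketch of what an RMI calculator might look like, not the original example. It runs the registry, server and client inside one JVM for convenience. The port number, the binding name "Calculator" and the class names are assumptions; the RMI classes and calls themselves (Remote, UnicastRemoteObject, LocateRegistry) are the standard java.rmi ones.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Interface: plain Java, extends java.rmi.Remote, no separate IDL file.
interface Calculator extends Remote {
    int add(int a, int b) throws RemoteException;
}

// Server: inherits the network plumbing from UnicastRemoteObject.
class CalculatorServer extends UnicastRemoteObject implements Calculator {
    private static final long serialVersionUID = 1L;

    CalculatorServer() throws RemoteException {
        super(); // exports this object on an anonymous port
    }

    @Override
    public int add(int a, int b) {
        return a + b;
    }
}

class RmiCalculatorDemo {
    static final int REGISTRY_PORT = 50199; // hypothetical free port

    // Starts a registry and server, then calls add through a looked-up stub.
    static int roundTrip(int a, int b) {
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Registry registry = null;
        CalculatorServer server = null;
        try {
            registry = LocateRegistry.createRegistry(REGISTRY_PORT);
            server = new CalculatorServer();
            registry.rebind("Calculator", server);

            // Client: find the name service, look up the stub, make the call.
            Registry nameService = LocateRegistry.getRegistry("127.0.0.1", REGISTRY_PORT);
            Calculator calculator = (Calculator) nameService.lookup("Calculator");
            return calculator.add(a, b);
        } catch (RemoteException | NotBoundException e) {
            throw new IllegalStateException("RMI round trip failed", e);
        } finally {
            unexportQuietly(server);
            unexportQuietly(registry);
        }
    }

    // Unexports an object so the JVM can exit; ignores objects never exported.
    private static void unexportQuietly(Remote exported) {
        if (exported == null) {
            return;
        }
        try {
            UnicastRemoteObject.unexportObject(exported, true);
        } catch (NoSuchObjectException ignored) {
            // already unexported
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(2, 3));
    }
}
```

In a real deployment the registry would usually be the separate rmiregistry program, and the client would run on another machine with only the Calculator interface on its classpath.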
Fun fact about RMI

Java RMI’s biggest problem is that it is super
tightly integrated with Java

For example:
– The RMI client actually doesn’t have the stub code
for the server.
– Instead, it downloads it from the server on first
connect.

Both versions of Java must be exactly the same.

This has implications for security too, as it must trust the
code it downloads.
.NET Remoting Server
.NET Remoting Client
Fun facts about .NET Remoting

You may have noticed that we didn’t explicitly create
an instance of the server object.

Instead we quite lazily register the server’s class.

.NET loves the idea of making object creation an RPC
too!

This is cool and all, but can result in code errors where
you create a client side version of the server side
object.
– This is very hard to detect
.NET WCF Server
.NET WCF Client
Some useful things for WCF

You’ll have noticed the [ServiceContract],
[OperationContract] and [ServiceBehavior]
attributes.
– Just remember, Contracts for the Interface, Behavior
for the implementation.

You need to build a class factory to use a WCF
interface
– Factories are classes that build other classes
– Really just here cause Microsoft found it was a popular
approach to RPC.
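A factory that builds client-side objects from nothing but an interface can be sketched in Java with dynamic proxies. This is a loose analogy, not WCF: ChannelFactorySketch, CalculatorService and the loopback transport are hypothetical names, and the transport answers locally where a real one would send the call over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical service interface shared by client and server.
interface CalculatorService {
    int add(int a, int b);
}

class ChannelFactorySketch {
    // Builds an object that implements the contract and routes every call
    // through the given transport.
    static <T> T createChannel(Class<T> contract, InvocationHandler transport) {
        Object proxy = Proxy.newProxyInstance(
                contract.getClassLoader(), new Class<?>[] { contract }, transport);
        return contract.cast(proxy);
    }

    // Stand-in transport: answers "add" locally instead of going over a network.
    static InvocationHandler loopback() {
        return (proxy, method, arguments) -> {
            if (method.getName().equals("add")) {
                return (Integer) arguments[0] + (Integer) arguments[1];
            }
            throw new UnsupportedOperationException(method.getName());
        };
    }

    public static void main(String[] args) {
        CalculatorService calculator = createChannel(CalculatorService.class, loopback());
        System.out.println(calculator.add(2, 3)); // prints 5
    }
}
```

The client code only ever sees the interface; which class actually sits behind it is the factory's business.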
Some More Useful Things for WCF

ServiceBehavior has a lot of fields.

What we’re doing is overriding Microsoft’s default
single threaded automatically synchronised system.
– Why? Because it’s really inefficient. And because we like
taking our lives into our hands.

Basically, Microsoft will often assume that you mean
single threaded by default
– This is very important, as a lot of programmers come to
Windows first.
– But it sucks for us, so we’ll be overriding a lot.
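The cost of turning off automatic synchronisation is that shared state becomes your problem. Here is a small Java analogue, assuming a single service object shared by several client threads; ConcurrentCalculator and serveConcurrently are made-up names and nothing here is WCF code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentCalculator {
    // Shared state touched by every call, so it must be thread-safe.
    private final AtomicInteger callsServed = new AtomicInteger();

    int add(int a, int b) {
        callsServed.incrementAndGet();
        return a + b;
    }

    int callsServed() {
        return callsServed.get();
    }

    // Simulates several clients hitting one service instance at the same time
    // and returns how many calls the service counted.
    static int serveConcurrently(int clients, int callsPerClient) {
        ConcurrentCalculator service = new ConcurrentCalculator();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        for (int client = 0; client < clients; client++) {
            pool.execute(() -> {
                for (int call = 0; call < callsPerClient; call++) {
                    service.add(1, 1);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("clients did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return service.callsServed();
    }

    public static void main(String[] args) {
        System.out.println(serveConcurrently(4, 1000)); // prints 4000
    }
}
```

Swap the AtomicInteger for a plain int field and the count will usually come up short, which is exactly the kind of bug the single threaded default protects beginners from.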
More WCF stuff?

Also, you can’t pass RPC objects by reference.
– Why? Because WCF is service oriented, and so it
wants to force you as the client to come to it.
– This fixes a lot of OO problems over the network.
– Objects can be passed by value though.

These aren’t server objects though, they’re data objects.
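Passing by value means the receiver gets its own copy of the data object. A minimal Java sketch, using standard Java serialisation as a stand-in for whatever the RPC framework does on the wire; Operands and sendByValue are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// A data object: state only, no server behaviour.
class Operands implements Serializable {
    private static final long serialVersionUID = 1L;
    int a;
    int b;

    Operands(int a, int b) {
        this.a = a;
        this.b = b;
    }
}

class PassByValue {
    // Serialises the object to bytes and reads it back, the way a copy would
    // arrive on the other side of a network call.
    static Operands sendByValue(Operands original) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(wire)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(wire.toByteArray()))) {
                return (Operands) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    public static void main(String[] args) {
        Operands sent = new Operands(2, 3);
        Operands received = sendByValue(sent);
        received.a = 40;            // the receiver changes its copy
        System.out.println(sent.a); // prints 2: the sender's object is untouched
    }
}
```

Changes made by the receiver never travel back, which is why this works for data objects but not for server objects.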
Why do people hate old systems?

Why are these older systems falling out of
favour?
– Firewalls (block a lot of ports to stop hackers)
– Configuration overheads (gotta tell clients where
servers are, and COM’s GUIDs make changes very
expensive)
– Proprietary
– And because the Internet

Seriously, why don’t we just use HTTP?
Next Week

The tiering system of basic distributed systems!
– You will have some idea of this from this week’s
tutorial

Asynchronous Communications

Statelessness!
