Distributed Computing
Distribution, Part II: The History of Distributed Computing

In the beginning...
● The first thing you probably think of is Mainframe Computing
– That’s distributed, right?
– The computer’s over there, my terminal is over here…
– There are many terminals, gotta be distributing something, right?
● But this isn’t distributed computing, as all the compute is in one place.

In the beginning…
● Distribution first arose when a single organisation could have multiple computers.
● The problem is one of resource sharing (on ARPANET circa 1976 no less).
● Actually predates the TCP/IP stack.
– Used NCP, the Network Control Program.
● Most RPC stacks were hack jobs for single purpose systems. Scalability? Who needs that?

Xerox PARC
● Special projects research lab owned by Xerox (you’ll likely know them for their printers)
● Invented a Xerox-specific RPC system for Xerox machines.
● This was based on Xerox’s understanding that the future of computing would have many computers per organisation.
● They also invented the first GUI.
– A little company called Apple stole it though.

Sun ONC RPC
● The year is 1984, and Sun Microsystems has not invented Java yet.
● They do have very cool Unix workstations though.
● And they have a problem:
– Hey, I need that file from over there, but I don’t want to pay for it to be shipped in floppy disk format to me. Surely I can use the company network to get it, right?

Sun ONC RPC
● Sadly no, you could not, as remote file mounts had not been invented.
– Actually remote anything was a bit of a far-fetched idea
● So naturally, Sun invented the Network File System (NFS)
– The descendant of this system runs the home directory shares in the labs!
● This included a means of remotely using file systems (via RPC) called Open Network Computing Remote Procedure Call.

Sun ONC RPC
● ONC RPC was wildly popular, as it was both open source (BSD) and was generic and well structured.
● Problem was this only defined an RPC protocol, not a library that actually did it.
● So unless you were using C, you were going to have a bad time.

DCE/RPC
● In the early 1990s, IBM got around to doing RPC properly
– Of course they couldn’t do it themselves, so they got HP, DEC, and even Sun together to help out.
● The Open Software Foundation defined a new RPC framework called the Distributed Computing Environment.

DCE/RPC
● DCE was super cool:
– Included a common way of doing authentication
– Had the first built-in time service
– Integrated DNS
– Distributed File System
– And a Remote Procedure Call system
● Wait a minute… That reminds me of Windows Domains…

DCE/RPC
● DCE is still just a guideline though
– Didn’t really have anything greater than a C implementation
– That said, this was the days of Unix…
● Was hugely popular with larger organisations, especially now that the IBM PC was gaining serious traction.

CORBA
● Common Object Request Broker Architecture
● What if you’re not using C?
● What if you don’t like using these big, bloated frameworks?
● What if you just want two dang programs to communicate on two computers?
● You use CORBA, that’s what you do!

CORBA
● Directly competed with DCE
● Didn’t have any of those fancy-pants authentication/time/file system add-ons
● Just let you define interfaces for computer programs to use.
● Actually was an integrated system for doing so (not just a set of guidelines and some C integrations)

CORBA
● CORBA is built around the idea of Object Request Brokers (or ORBs)
● ORBs are a middleware service that allows languages to communicate over the network
● ORBs are designed to be cross-compatible, regardless of architecture or underlying language.
● ORBs are represented as objects, which allows the system to hide nasty code inside classes.

CORBA
● Objects that define interfaces to the internet?
That sounds like a Component!
– And CORBA agreed… eventually
– Added support for all the bloat-features of DCE, but they were optional.
● CORBA had ORBs for each OO language
– C++, Java, etc
– Your connection objects simply inherited from whichever ORB class was present.

CORBA
● Why was this all so cool?
– There was still no standardised format for passing data around the internet.
– CORBA provided one that was language and system independent.
– It also provided a language-independent means of writing the interfaces, meaning clients and servers were implementation independent!
– CORBA is still around today (although not popular)

CORBA IDL

CORBA IDL Process

CORBA Today
● Good idea, but there were problems
– Spec was hugely complicated because the ORBs were written by different vendors.
● Who charged a lot
● ORBs turned out to not be as interoperable as promised
– Competing less expensive frameworks killed off the project
● Java RMI was free and did the same thing by 1999
● Also, Microsoft

We’ve forgotten someone important

Microsoft is Distributed Computing
● Microsoft has dominated distributed computing since the mid-90s.
● This is because Microsoft has based their entire OS line around the idea of many computers in enormous distributed systems since Windows for Workgroups 3.11
● This idea has been the key to Microsoft’s success throughout the years.

DLLs
● The DLL is the fundamental building block of modern Windows systems
● Very similar to Unix Shared Object libraries
– They are linked at run time
– Can also be linked at compile time
– Language neutral
● However, DLLs support Late Binding.

DLL Late Binding
● The “killer feature” of DLLs is that functions can be bound by name
– At run time, the OS can search the DLL for a specific function name
● This means that applications can check for missing DLLs and DLL compatibility issues at run time. This can avoid crashes and allows for dynamic coding.
● However, this is slower and there are no compile-time checks.
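The by-name lookup described above can be illustrated without touching the Windows API. The sketch below is only an analogy, not DLL code: it uses Java reflection to find a function by name at run time, check whether it exists before calling it, and invoke it with no compile-time check. The class name `LateBindingSketch` and the methods `isAvailable` and `callByName` are hypothetical names chosen for illustration; `java.lang.Math` stands in for the "library".

```java
import java.lang.reflect.Method;

public class LateBindingSketch {

    /** Returns true if the named class exposes a public method functionName(int, int). */
    public static boolean isAvailable(String className, String functionName) {
        try {
            // Both lookups happen at run time; a missing class or method is
            // reported as an exception rather than a compile error.
            Class.forName(className).getMethod(functionName, int.class, int.class);
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    /** Binds to a public static functionName(int, int) by name and calls it. */
    public static int callByName(String className, String functionName, int a, int b) {
        try {
            Class<?> library = Class.forName(className);   // roughly: load the library
            Method function =
                    library.getMethod(functionName, int.class, int.class); // roughly: find the export
            // Passing null as the receiver assumes the method is static.
            return (Integer) function.invoke(null, a, b);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("late binding failed: " + functionName, e);
        }
    }

    public static void main(String[] args) {
        // Check first, then call: the same defensive pattern the slide describes.
        if (isAvailable("java.lang.Math", "addExact")) {
            System.out.println(callByName("java.lang.Math", "addExact", 2, 3));
        }
    }
}
```

Note the trade-off the slide mentions: a typo in either string is only discovered when the lookup runs, and every call pays for the lookup unless the result is cached.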
DLL Functions in C or C++
● All exported declarations in DLLs are prefixed with __declspec(dllexport)
– This includes classes and functions
● An alternative is to use a .def file
– This allowed for ordinal positions of functions
– But this is not widely used, and so not very popular

DLL Definitions in C#
● Are just class libraries
– i.e. groups of classes that work together
● These have no special rules and can simply be compiled via Visual Studio.

Calling Functions in DLLs
● Using C++/C, compile against header file and .lib file
– The .lib file contains a stub to perform the DLL lookup
● Otherwise you need to use the Windows API
– Example of this on the next slide
– Different languages do this differently
– COM DLLs must be handled differently
– .NET DLLs need the .NET common language runtime

DLL pros
● Exe files are smaller as DLLs are incorporated at run time
– Disk space use is less too, as you only need one DLL for many applications
● Can share in-memory DLL code amongst all DLL apps
● Upgrading a DLL upgrades all client applications

DLL cons
● Versions of the DLLs used by an application must be compatible with each other and the application
– Bad upgrades can break every app that uses it
● Dependencies are outside of the compiled application
● Security issues exist with “by name” access
– Name clashes?

DLLs today
● Very old by component standards
– Have existed since OS/2 times.
● More a component container system
● DLLs can be normal, COM, or .NET components.
– Modern .NET systems allow all compiled code to act as DLLs. Even EXEs!
● So you will probably use DLLs in industry.

What is COM?
● COM: Component Object Model
● Also known by its cool rebranded name ActiveX
● Developed out of Microsoft’s Object Linking and Embedding architecture (OLE)
– OLE allowed one application to host objects from another
– This is what lets you embed Excel spreadsheets in Word.

What is COM?
● COM is OLE extended along CORBA lines
– Interfaces defined by Microsoft’s IDL, MIDL
– Interface-based RPC (called DCOM)
– Name server (the Windows registry)
● Allows for lookup by GUID rather than name
● This is hideous, but allows for unique component lookup by version/system/machine.
● E.g. f943b44a-0d95-45e3-90c5-34e841c531b2
● Separated into Interface GUIDs (IIDs) and Class GUIDs (CLSIDs)

COM GUIDs
● Interfaces via their IID are unbreakable contracts
– This guarantees that clients can rely on them forever.
● Problem: Interfaces change all the time
– Every change of any kind needs a new IID.
– This results in huge logistical problems in COM projects.

DCOM
● Distributed computing was added to COM
– COM was initially just for OLE use.
● DCOM works much like COM, it just uses DCE/RPC to perform COM requests over a network interface.
● DCOM completely dominated DCE via Microsoft’s ever popular EEE (embrace, extend, extinguish) approach.
● This is still the underlying system behind all Windows Networks today.

COM GUIs
● Microsoft used COM to allow users to embed GUI elements into other applications.
● This allows for really easy extensibility of Microsoft programs, without needing to know how the underlying code works.
● This could be generalised to any component in a container.
● This was eventually renamed to ActiveX

ActiveX
● ActiveX directly competed with Java applets.
● Microsoft allowed ActiveX integration with IE
– This was a terrible, terrible idea.
● ActiveX implements a standard component interface
– IOleObject – defines parameters of GUI controls
– IDispatch – allows functions to be called by name.
● This was also a terrible idea.

COM Today
● Still the core of Windows networks.
● Very outdated, .NET is the king of the Windows Environment these days.
– However, lots of COM still exists, so .NET and COM have a very well defined interface
● Microsoft continues to push .NET and the general concept of Web Services out into the world.
– However, Google/Amazon have stolen their ideas and taken their crown.

Java RMI
● In the late 1990s, Java arrived, and brought with it Sun Microsystems’ RPC knowledge.
● Enterprise Java had a thing called RMI.
– Normal Java has it too these days
● Remote Method Invocation allows for RPC calls without any non-language tools.

Java RMI
● Like CORBA, uses a defined interface.
● Unlike CORBA, this is entirely defined in Java
– Using an… Interface.
– Needs to extend the java.rmi.Remote interface.
– Then create stub classes from that, and follow the CORBA process from that point.

Java RMI
● Like CORBA, inheritance is used to hide the nasty stuff.
– Server object inherits from UnicastRemoteObject
– Again, no IDL class required.
● Java also has a name service for finding components
– Called rmiregistry.
– It’s a command line program.
● Problem: RMI has no inbuilt security integration.

Java RMI Today
● Java RMI is still used today
● It works pretty well, and provides an all-in-one, no-frills approach to component distribution.
● The only problem is, it’s Java.
– And therefore kind of stands alone.

.NET
● Microsoft very much liked the idea of Java’s VM-based, universally compatible features.
– Microsoft tried to make a Java implementation in 1996.
– Sun actually sued Microsoft for not following the spec.
● Eventually though, Microsoft decided to build their own Java-like system.
– This was named .NET, and its native language C#

The .NET CLR
● Works like the Java VM
– Compiles source code to machine-independent byte code (the Common Intermediate Language)
– Performs memory management and integrates the underlying OS.
– Converts byte code into platform-specific executable code via a JIT (Just in Time) compiler.
– Both allow multiple languages, provided they can compile to the VM’s byte code (the CIL for .NET).
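The Java RMI pieces described in the Java RMI slides above (an interface extending java.rmi.Remote, a server object inheriting from UnicastRemoteObject, and a registry for lookup) can be sketched in a single file. This is a minimal sketch under stated assumptions, not the unit's own example code: the names `RmiCalculatorSketch`, `Calculator`, `CalculatorImpl` and `remoteAdd` are hypothetical, the registry is created inside the same JVM instead of running the rmiregistry command, and client and server share one process purely so the example is self-contained. The hostname property is pinned to 127.0.0.1 so the stub connects over loopback.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.RemoteObject;
import java.rmi.server.UnicastRemoteObject;

public class RmiCalculatorSketch {

    /** The remote interface: plain Java, no separate IDL file. */
    public interface Calculator extends Remote {
        int add(int a, int b) throws RemoteException;
    }

    /** Server object; the UnicastRemoteObject constructor exports it for remote calls. */
    public static class CalculatorImpl extends UnicastRemoteObject implements Calculator {
        private static final long serialVersionUID = 1L;

        public CalculatorImpl() throws RemoteException {
            super(); // export on an anonymous port
        }

        @Override
        public int add(int a, int b) throws RemoteException {
            return a + b;
        }
    }

    /** Starts a registry, binds the server, looks it up as a client would, and calls add. */
    public static int remoteAdd(int a, int b) {
        try {
            // Assumption: force stubs to use loopback so this works on any machine.
            System.setProperty("java.rmi.server.hostname", "127.0.0.1");
            Registry registry = LocateRegistry.createRegistry(0); // in-process name service
            CalculatorImpl server = new CalculatorImpl();
            try {
                // Bind the stub (not the server object itself) so the lookup
                // below returns a real remote reference.
                registry.bind("Calculator", RemoteObject.toStub(server));

                // Client side: find the component by name and call it via the stub.
                Calculator client = (Calculator) registry.lookup("Calculator");
                return client.add(a, b);
            } finally {
                // Unexport so the JVM can exit once the demo is done.
                UnicastRemoteObject.unexportObject(server, true);
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (Exception e) {
            throw new IllegalStateException("RMI demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(remoteAdd(2, 3));
    }
}
```

In a real deployment the server and client would be separate programs, and the client would obtain the registry with `LocateRegistry.getRegistry(host, port)` rather than sharing the object in-process.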
CLR CIL
● Code that compiles to CIL is called managed code and is managed by the .NET framework
– Better security because no pointers
– Platform independence via the .NET VM
– However, slower due to JIT compilation
● This is very nearly not a problem these days due to a lot of paravirtualization.

CLR Non-CIL
● Code not supported by the CIL is called unmanaged code (also unsafe or native code)
– Less security
– Generally speaking limited in languages (to C++)
– C++ and C# both can allow for managed and unmanaged code in the same application
● Although this is discouraged and will be penalised if you do it in this unit.
● Basically there should be very nearly no reason to do this.

.NET Remoting
● .NET Remoting is a system that essentially replaces DCOM for .NET
● Is, unsurprisingly, very similar to RMI
● However, there is no IDL or visible proxy code
– It’s all hidden in the .NET backend.
– Remotely-callable server objects must derive from MarshalByRefObject.
– The server object’s public methods are the RPC interface. (very cool)

.NET Remoting
● The client must reference the server assembly (EXE/DLL)
– The client needs access to the metadata of the object (kind of like IDL).
– .NET does this by referencing the server object.
● This is kind of like including a header file, but with a lot of background magic
– This can be avoided with class factories.

.NET Remoting Today
● Mostly a legacy system, as Microsoft has a newer Web Services compatible .NET RPC framework called WCF.
● Remoting is still relevant because:
– Remoting does not require a web server
– Remoting supports binary message formats (which are always more efficient than XML/JSON systems)
● WCF combines Remoting with Web Services
– And a healthy dose of automagic coding.

.NET WCF
● The Windows Communication Foundation (WCF) is an extension of .NET Remoting.
● More like RMI as it uses an interface class.
● MarshalByRefObject now replaced by [ServiceContract] and [OperationContract] attributes.
● Tons more automatic code generation.
● Still pretty much the same as older RPC frameworks.

Examples!
● For completeness’ sake, let’s look at some examples.
● These could be useful in a tutorial or something…

What are we building?
● A Calculator!
– More specifically, a calculator add function.
● Why on earth are we distributing this?
– This may be dumb, but makes the code simple and lets us focus on the similarities and differences
– Also gives you an idea how easy it is.
● Examples of code are very useful as you progress through industry! Keep these somewhere!

Some Generic IDL

C++ Server DLL

C++ Client

COM Component
● We’re not going to include COM.
● COM is for all practical purposes deprecated
– Has been since before Windows XP.
– It’s very ugly in implementation
– We’ll be using .NET exclusively… Soooo…
● Moving on.

CORBA – Java (Server)

CORBA – Java (Client)

Java RMI Interface

Java RMI Server

Java RMI Client

Fun fact about RMI
● Java RMI’s biggest problem is that it is super tightly integrated with Java
● For example:
– The RMI client actually doesn’t have the stub code for the server.
– Instead, it downloads it from the server on first connect.
● Both versions of Java must be exactly the same.
● This has implications for security too, as it must trust the code it downloads.

.NET Remoting Server

.NET Remoting Client

Fun facts about .NET Remoting
● You may have noticed that we didn’t explicitly create an instance of the server object.
● Instead we quite lazily register the server’s class.
● .NET loves the idea of making object creation an RPC too!
● This is cool and all, but can result in code errors where you create a client-side version of the server-side object.
– This is very hard to detect

.NET WCF Server

.NET WCF Client

Some useful things for WCF
● You’ll have noticed [ServiceContract], [OperationContract] and [ServiceBehavior] attributes.
– Just remember, Contracts for the Interface, Behavior for the implementation.
● You need to build a class factory to use a WCF interface
– Factories are classes that build other classes
– Really just here because Microsoft found it was a popular approach to RPC.

Some More Useful Things for WCF
● ServiceBehavior has a lot of fields.
● What we’re doing is overriding Microsoft’s default single-threaded, automatically synchronised system.
– Why? Because it’s really inefficient. And because we like taking our lives into our hands.
● Basically, Microsoft will often assume that you mean single-threaded by default
– This is very important, as a lot of programmers come to Windows first.
– But it sucks for us, so we’ll be overriding a lot.

More WCF stuff?
● Also, you can’t pass RPC objects via reference.
– Why? Because WCF is service oriented, and so it wants to force you as the client to come to it.
– This fixes a lot of OO problems over the network.
– Objects can be passed by value though.
● These aren’t server objects though, they’re data objects.

Why do people hate old systems?
● Why are these older systems falling out of favor?
– Firewalls (block a lot of ports to stop hackers)
– Configuration overheads (gotta tell clients where servers are, and COM’s GUIDs make changes very expensive)
– Proprietary
– And because the Internet
● Seriously, why don’t we just use HTTP?

Next Week
● The tiering system of basic distributed systems!
– You will have some idea of this from this week’s tutorial
● Asynchronous Communications
● Statelessness!