
CS6703 GRID AND CLOUD COMPUTING

Unit 4

Dr Gnanasekaran Thangavel
Professor and Head
Faculty of Information Technology
R M K College of Engineering and Technology
UNIT IV PROGRAMMING MODEL
Open source grid middleware packages – Globus Toolkit
(GT4) Architecture , Configuration – Usage of Globus –
Main components and Programming model -
Introduction to Hadoop Framework - Mapreduce, Input
splitting, map and reduce functions, specifying input and
output parameters, configuring and running a job – Design
of Hadoop file system, HDFS concepts, command line and
java interface, dataflow of File read & File write.

2 Dr Gnanasekaran Thangavel 09/20/2022


Open source grid middleware packages

The Open Grid Forum and the Object Management Group are two well-established organizations behind the grid standards.
Middleware is the software layer that connects software components. It lies between the operating system and the applications.
Grid middleware is a specially designed layer between hardware and software that enables the sharing of heterogeneous resources and manages the virtual organizations created around the grid.
Popular grid middleware packages include:
1. BOINC -Berkeley Open Infrastructure for Network Computing.
2. UNICORE - Middleware developed by the German grid computing community.
3. Globus (GT4) - A middleware library jointly developed by Argonne National Lab., Univ.
of Chicago, and USC Information Science Institute, funded by DARPA, NSF, and NIH.
4. CGSP in ChinaGrid - The CGSP (ChinaGrid Support Platform) is a middleware library
developed by 20 top universities in China as part of the ChinaGrid Project
3 Dr Gnanasekaran Thangavel 09/20/2022
Open source grid middleware packages (continued)

5. Condor-G - Originally developed at the Univ. of Wisconsin for general distributed computing, and later extended to Condor-G for grid job management.
6. Sun Grid Engine (SGE) - Developed by Sun Microsystems for business grid applications. Applied to private grids and local clusters within enterprises or campuses.
7. gLite - Born from the collaborative efforts of more than 80 people in 12 different academic and industrial research centers as part of the EGEE Project, gLite provides a framework for building grid applications that tap into the power of distributed computing and storage resources across the Internet.
4 Dr Gnanasekaran Thangavel 09/20/2022
The Globus Toolkit Architecture (GT4)
The Globus Toolkit is an open middleware library for the grid computing communities. These open source software libraries support many operational grids and their applications on an international basis.
The toolkit addresses common problems and issues related to grid resource discovery, management, communication, security, fault detection, and portability. The software itself provides a variety of components and capabilities.
The library includes a rich set of service implementations. The implemented software supports grid infrastructure management, provides tools for building new web services in Java, C, and Python, builds a powerful standards-based security infrastructure and client APIs (in different languages), and offers comprehensive command-line programs for accessing various grid services.
The Globus Toolkit was initially motivated by a desire to remove obstacles that prevent seamless collaboration, and thus sharing of resources and services, in scientific and engineering applications. The shared resources can be computers, storage, data, services, networks, science instruments (e.g., sensors), and so on. The Globus library version GT4 is shown conceptually in the following figure.

5 Dr Gnanasekaran Thangavel 09/20/2022


The Globus Toolkit

6 Dr Gnanasekaran Thangavel 09/20/2022


The Globus Toolkit
GT4 offers the middle-level core services in grid applications.
The high-level services and tools, such as MPI, Condor-G, and Nimrod/G, are developed by third parties for general-purpose distributed computing applications.
The local services, such as LSF, TCP, Linux, and Condor, are at the bottom level and are fundamental tools supplied by other developers.
As a de facto standard in grid middleware, GT4 is based on industry-standard web service technologies.
7 Dr Gnanasekaran Thangavel 09/20/2022
Functionalities of GT4
Global Resource Allocation Manager (GRAM) - Grid Resource Access and Management (HTTP-based)
Communication (Nexus) - Unicast and multicast communication
Grid Security Infrastructure (GSI) - Authentication and related security services
Monitoring and Discovery Service (MDS) - Distributed access to structure and state information
Health and Status (HBM) - Heartbeat monitoring of system components
Global Access of Secondary Storage (GASS) - Grid access of data in remote secondary storage
Grid File Transfer (GridFTP) - Inter-node fast file transfer

8 Dr Gnanasekaran Thangavel 09/20/2022


Globus Job Workflow

9 Dr Gnanasekaran Thangavel 09/20/2022


Globus Job Workflow
A typical job execution sequence proceeds as follows:
1. The user delegates his credentials to a delegation service.
2. The user submits a job request to GRAM with the delegation identifier as a parameter.
3. GRAM parses the request, retrieves the user proxy certificate from the delegation service, and then acts on behalf of the user.
4. GRAM sends a transfer request to the RFT (Reliable File Transfer) service, which uses GridFTP to bring in the necessary files.
5. GRAM invokes a local scheduler via a GRAM adapter, and the SEG (Scheduler Event Generator) initiates a set of user jobs.
6. The local scheduler reports the job state to the SEG. Once the job is complete, GRAM uses RFT and GridFTP to stage out the resultant files. The grid monitors the progress of these operations and sends the user a notification.

10 Dr Gnanasekaran Thangavel 09/20/2022
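A hedged command-line sketch of this workflow (assuming a standard GT4 client installation and a reachable WS GRAM container; the executable and the exact flags may differ by installation):

    grid-proxy-init                            # create a proxy certificate from the user's credentials (delegation)
    globusrun-ws -submit -s -c /bin/hostname   # submit a simple job to WS GRAM and stream its output back

grid-proxy-init establishes the delegated credential; globusrun-ws then submits the job request that GRAM parses and forwards to the local scheduler.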


Client-Globus Interactions

There are strong interactions between provider programs and user code. GT4 makes heavy use of industry-standard web service protocols and mechanisms for service description, discovery, access, authentication, and authorization.
GT4 makes extensive use of Java, C, and Python for writing user code. Web service mechanisms define specific interfaces for grid computing.
Web services provide flexible, extensible, and widely adopted XML-based interfaces.

11 Dr Gnanasekaran Thangavel 09/20/2022


Data Management Using GT4
Grid applications often need to provide access to and/or integrate large quantities of data at multiple sites. The GT4 tools can be used individually or in conjunction with other tools to develop interesting solutions for efficient data access. The following list briefly introduces these GT4 tools:
1. GridFTP supports reliable, secure, and fast memory-to-memory and disk-to-disk data movement over high-bandwidth WANs. Based on the popular FTP protocol for Internet file transfer, GridFTP adds features such as parallel data transfer, third-party data transfer, and striped data transfer. In addition, GridFTP benefits from the strong Globus Security Infrastructure for securing data channels with authentication and reusability. It has been reported that the grid has achieved 27 Gbit/second end-to-end transfer speeds over some WANs.
2. RFT (Reliable File Transfer) provides reliable management of multiple GridFTP transfers. It has been used to orchestrate the transfer of millions of files among many sites simultaneously.
3. RLS (Replica Location Service) is a scalable system for maintaining and providing access to information about the location of replicated files and data sets.
4. OGSA-DAI (Open Grid Services Architecture - Data Access and Integration) tools were developed by the UK eScience program and provide access to relational and XML databases.
12 Dr Gnanasekaran Thangavel 09/20/2022
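As an illustration of GridFTP in practice, a parallel third-party transfer can be driven with the globus-url-copy client. This is only a sketch; the endpoint URLs below are hypothetical, and -p sets the number of parallel data streams:

    globus-url-copy -p 4 \
        gsiftp://source.example.org/data/input.dat \
        gsiftp://dest.example.org/data/input.dat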
Introduction to Hadoop Framework
Hadoop is an Apache top-level project: an open-source implementation of frameworks for reliable, scalable, distributed computing and data storage.
It is a flexible and highly available architecture for large-scale computation and data processing on a network of commodity hardware.
Hadoop offers a software platform that was originally developed by a Yahoo! group. The package enables users to write and run applications over vast amounts of distributed data.
Users can easily scale Hadoop to store and process petabytes of data in the web space.
Hadoop is economical in that it comes with an open source version of MapReduce that minimizes overhead in task spawning and massive data communication.
It is efficient, as it processes data with a high degree of parallelism across a large number of commodity nodes, and it is reliable in that it automatically keeps multiple data copies to facilitate redeployment of computing tasks upon unexpected system failures.
13 Dr Gnanasekaran Thangavel 09/20/2022
Hadoop
• Hadoop: an open-source software framework that supports data-intensive distributed applications, licensed under the Apache v2 license.

• Goals / Requirements:
• Abstract and facilitate the storage and processing of large and/or rapidly growing data sets
• Structured and non-structured data
• Simple programming models
• High scalability and availability
• Use commodity (cheap!) hardware with little redundancy
• Fault-tolerance
• Move computation rather than data

14 Dr Gnanasekaran Thangavel 09/20/2022
Hadoop Framework Tools

15 Dr Gnanasekaran Thangavel 09/20/2022


Hadoop’s Architecture
Distributed, with some centralization

The main nodes of the cluster are where most of the computational power and storage of the system lies

Main nodes run a TaskTracker to accept and reply to MapReduce tasks, and also a DataNode to store needed blocks as close as possible

The central control node runs the NameNode to keep track of HDFS directories and files, and the JobTracker to dispatch compute tasks to TaskTrackers

Written in Java; also supports Python and Ruby

16 Dr Gnanasekaran Thangavel 09/20/2022


Hadoop’s Architecture
• Hadoop Distributed File System (HDFS)
• Tailored to the needs of MapReduce
• Targeted towards many reads of file streams
• Writes are more costly
• High degree of data replication (3x by default)
• No need for RAID on normal nodes
• Large block size (64 MB)
• Location awareness of DataNodes in the network

17 Dr Gnanasekaran Thangavel 09/20/2022
Hadoop’s Architecture

NameNode:
Stores metadata for the files, like the directory structure of a typical FS.
The server holding the NameNode instance is quite crucial, as there is only one.
Transaction log for file deletes/adds, etc. Does not use transactions for whole blocks
or file-streams, only metadata.
Handles creation of more replica blocks when necessary after a DataNode failure
DataNode:
• Stores the actual data in HDFS
• Can run on any underlying filesystem (ext3/4, NTFS, etc)
• Notifies NameNode of what blocks it has
• NameNode replicates blocks 2x in local rack, 1x elsewhere
18 Dr Gnanasekaran Thangavel 09/20/2022
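The division of labor between the NameNode (metadata) and the DataNodes (blocks) is hidden behind the HDFS Java interface. A minimal sketch of reading a file through org.apache.hadoop.fs.FileSystem follows; the class name and the HDFS URI are illustrative only:

import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsCat {
  public static void main(String[] args) throws Exception {
    String uri = args[0];                      // e.g. hdfs://namenode:9000/user/demo/file.txt (hypothetical)
    Configuration conf = new Configuration();  // picks up core-site.xml / hdfs-site.xml settings
    FileSystem fs = FileSystem.get(URI.create(uri), conf);

    InputStream in = null;
    try {
      // The client asks the NameNode for block locations, then streams
      // the blocks directly from the DataNodes that hold them.
      in = fs.open(new Path(uri));
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}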
Hadoop’s Architecture

19 Dr Gnanasekaran Thangavel 09/20/2022


Hadoop’s Architecture
MapReduce Engine:
JobTracker & TaskTracker
The JobTracker splits up data into smaller tasks ("Map") and sends them to the TaskTracker process in each node
The TaskTracker reports back to the JobTracker node on job progress, sends data ("Reduce"), or requests new jobs
None of these components is necessarily limited to using HDFS
Many other distributed file systems with quite different architectures work
Many other software packages besides Hadoop's MapReduce platform make use of HDFS

20 Dr Gnanasekaran Thangavel 09/20/2022
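As a concrete sketch of the map and reduce functions, here is the classic word count written against the old org.apache.hadoop.mapred API that the rest of this unit walks through. The class names are illustrative, not part of Hadoop itself:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Map: for every input line, emit (word, 1) for each token.
public class WordCountMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      output.collect(word, ONE);
    }
  }
}

// Reduce: sum all the counts emitted for the same word.
class WordCountReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}

The JobTracker runs many map tasks in parallel over the input splits and then routes each word's counts to a reduce task.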


Hadoop in the Wild
Hadoop is in use at many organizations that handle big data:
Yahoo!
Facebook
Amazon
Netflix
Etc…

Some examples of scale:

Yahoo!'s Search Webmap runs on a 10,000-core Linux cluster and powers Yahoo! Web search

Facebook's Hadoop cluster hosts 100+ PB of data (July 2012) and is growing at about ½ PB/day (Nov 2012)

21 Dr Gnanasekaran Thangavel 09/20/2022


Three main applications of Hadoop
Advertisement (Mining user behavior to generate recommendations)

Searches (group related documents)

Security (search for uncommon patterns)

22 Dr Gnanasekaran Thangavel 09/20/2022


Hadoop Highlights
Distributed File System
Fault Tolerance
Open Data Format
Flexible Schema
Queryable Database
Why use Hadoop?
Need to process multi-petabyte datasets
Data may not have a strict schema
Expensive to build reliability into each application
Nodes fail every day
Need a common infrastructure
Very large distributed file system
Assumes commodity hardware
Optimized for batch processing
Runs on heterogeneous OSes
23 Dr Gnanasekaran Thangavel 09/20/2022
DataNode

A Block Server
Stores data in the local file system
Stores metadata of a block - checksum
Serves data and metadata to clients
Block Report
Periodically sends a report of all existing blocks to the NameNode
Facilitates Pipelining of Data
Forwards data to other specified DataNodes

24 Dr Gnanasekaran Thangavel 09/20/2022


Block Placement

Replication Strategy
One replica on local node
Second replica on a remote rack
Third replica on same remote rack
Additional replicas are randomly placed
Clients read from nearest replica

25 Dr Gnanasekaran Thangavel 09/20/2022
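The replication factor behind this placement strategy is configurable, both cluster-wide (dfs.replication in hdfs-site.xml) and per file. A minimal sketch of changing it from client code; the file path is hypothetical:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplication {
  public static void main(String[] args) throws Exception {
    String uri = args[0];                 // an existing HDFS file, e.g. hdfs://namenode:9000/user/demo/data.txt

    Configuration conf = new Configuration();
    conf.set("dfs.replication", "3");     // default replication for files created by this client

    FileSystem fs = FileSystem.get(URI.create(uri), conf);
    // Raise the replication factor of an existing file; the NameNode then
    // schedules the additional replicas on suitable DataNodes.
    fs.setReplication(new Path(uri), (short) 5);
  }
}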


Data Correctness

Use checksums to validate data - CRC32

File Creation
The client computes a checksum per 512 bytes
The DataNode stores the checksums
File Access
The client retrieves the data and checksum from the DataNode
If validation fails, the client tries other replicas

26 Dr Gnanasekaran Thangavel 09/20/2022
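To make the checksum idea concrete, the sketch below computes a CRC32 value for each 512-byte chunk of a buffer, mirroring what the HDFS client does before handing data to a DataNode. This is an illustration only, not HDFS source code:

import java.util.zip.CRC32;

public class ChunkChecksum {
  public static void main(String[] args) {
    byte[] data = new byte[2048];          // stand-in for file data
    int chunkSize = 512;                   // HDFS checksums data in 512-byte chunks

    CRC32 crc = new CRC32();
    for (int off = 0; off < data.length; off += chunkSize) {
      int len = Math.min(chunkSize, data.length - off);
      crc.reset();
      crc.update(data, off, len);
      System.out.printf("chunk at offset %d: crc32 = %d%n", off, crc.getValue());
    }
  }
}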


Data Pipelining

The client retrieves a list of DataNodes on which to place replicas of a block
The client writes the block to the first DataNode
The first DataNode forwards the data to the next DataNode in the pipeline
When all replicas are written, the client moves on to write the next block in the file

27 Dr Gnanasekaran Thangavel 09/20/2022
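From the application's point of view the pipeline is transparent: the client simply writes to an output stream obtained from the FileSystem API, and HDFS forwards each block down the replica pipeline. A minimal sketch (paths are illustrative):

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsPut {
  public static void main(String[] args) throws Exception {
    String localSrc = args[0];             // local file to copy
    String dst = args[1];                  // e.g. hdfs://namenode:9000/user/demo/out.txt (hypothetical)

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create(dst), conf);

    InputStream in = new BufferedInputStream(new FileInputStream(localSrc));
    // create() returns a stream; under the hood the client writes each block
    // to the first DataNode, which forwards it along the replica pipeline.
    OutputStream out = fs.create(new Path(dst));
    IOUtils.copyBytes(in, out, 4096, true);   // true = close both streams when done
  }
}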


MapReduce Usage

Log processing
Web search indexing
Ad-hoc queries

28 Dr Gnanasekaran Thangavel 09/20/2022


MapReduce Process (org.apache.hadoop.mapred)
JobClient
Submits the job
JobTracker
Manages and schedules the job, splits the job into tasks
TaskTracker
Starts and monitors task execution
Child
The process that actually executes the task

29 Dr Gnanasekaran Thangavel 09/20/2022
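The JobClient side of this process is what user code drives when configuring and running a job. A hedged sketch of a driver that specifies the input and output parameters and submits a word-count job; it assumes the WordCountMapper and WordCountReducer classes sketched earlier in this unit:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCountDriver.class);
    conf.setJobName("wordcount");

    // Key/value types produced by the reducer.
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    // Map and reduce implementations (from the earlier sketch).
    conf.setMapperClass(WordCountMapper.class);
    conf.setReducerClass(WordCountReducer.class);

    // How input is split into records and how output is written.
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    // Input and output paths are taken from the command line.
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // runJob() submits the job and polls the JobTracker until completion;
    // JobClient.submitJob() would instead return immediately with a handle.
    JobClient.runJob(conf);
  }
}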


Inter Process Communication
Protocols
JobSubmissionProtocol
JobClient <-------------> JobTracker
InterTrackerProtocol
TaskTracker <------------> JobTracker
TaskUmbilicalProtocol
TaskTracker <-------------> Child
The JobTracker implements both the JobSubmissionProtocol and the InterTrackerProtocol, and acts as the server in both IPC channels.
The TaskTracker implements the TaskUmbilicalProtocol; the Child gets task information and reports task status through it.
30 Dr Gnanasekaran Thangavel 09/20/2022
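These protocols are ordinary Java interfaces served over Hadoop's own IPC layer. The toy example below is not Hadoop source; the names are invented, and it is sketched against the 0.20-era org.apache.hadoop.ipc API: define a VersionedProtocol, export an implementation with RPC.getServer, and call it through RPC.getProxy:

import java.io.IOException;
import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.ipc.Server;
import org.apache.hadoop.ipc.VersionedProtocol;

// A toy protocol in the spirit of TaskUmbilicalProtocol: the interface is the
// contract, the server exports an implementation, the client talks to a proxy.
interface PingProtocol extends VersionedProtocol {
  long versionID = 1L;
  String ping(String message) throws IOException;
}

public class PingServer implements PingProtocol {
  public String ping(String message) { return "pong: " + message; }

  public long getProtocolVersion(String protocol, long clientVersion) {
    return versionID;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Export the implementation on localhost:9900 (hypothetical address/port).
    Server server = RPC.getServer(new PingServer(), "localhost", 9900, conf);
    server.start();

    // Client side: obtain a proxy and make the remote call.
    PingProtocol proxy = (PingProtocol) RPC.getProxy(
        PingProtocol.class, PingProtocol.versionID,
        new InetSocketAddress("localhost", 9900), conf);
    System.out.println(proxy.ping("hello"));
  }
}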
JobClient.submitJob - 1

Check input and output, e.g. check whether the output directory already exists
job.getInputFormat().validateInput(job);
job.getOutputFormat().checkOutputSpecs(fs, job);
Get InputSplits, sort, and write output to HDFS
InputSplit[] splits = job.getInputFormat().getSplits(job, job.getNumMapTasks());
writeSplitsFile(splits, out); // out is $SYSTEMDIR/$JOBID/job.split

31 Dr Gnanasekaran Thangavel 09/20/2022


JobClient.submitJob - 2

The JAR file and configuration file will be uploaded to the HDFS system directory
job.write(out); // out is $SYSTEMDIR/$JOBID/job.xml
JobStatus status = jobSubmitClient.submitJob(jobId);
This is an RPC invocation; jobSubmitClient is a proxy created during initialization

32 Dr Gnanasekaran Thangavel 09/20/2022


Job initialization on JobTracker - 1

JobTracker.submitJob(jobID) <-- receive RPC invocation request


JobInProgress job = new JobInProgress(jobId, this, this.conf)
Add the job into Job Queue
jobs.put(job.getProfile().getJobId(), job);
jobsByPriority.add(job);
jobInitQueue.add(job);

33 Dr Gnanasekaran Thangavel 09/20/2022


Job initialization on JobTracker - 2

Sort by priority
resortPriority();
Compare the JobPriority first, then compare the JobSubmissionTime
Wake the JobInitThread
jobInitQueue.notifyAll();
job = jobInitQueue.remove(0);
job.initTasks();

34 Dr Gnanasekaran Thangavel 09/20/2022


JobTracker Task Scheduling - 1

Task getNewTaskForTaskTracker(String taskTracker)
Compute the maximum number of tasks that can be running on taskTracker
int maxCurrentMapTasks = tts.getMaxMapTasks();
int maxMapLoad = Math.min(maxCurrentMapTasks,
    (int) Math.ceil((double) remainingMapLoad / numTaskTrackers));

35 Dr Gnanasekaran Thangavel 09/20/2022


JobTracker Task Scheduling - 2

int numMaps = tts.countMapTasks(); // number of running tasks
If numMaps < maxMapLoad, more tasks can be allocated: based on priority, pick the first job from the jobsByPriority queue, create a task, and return it to the TaskTracker
Task t = job.obtainNewMapTask(tts, numTaskTrackers);

36 Dr Gnanasekaran Thangavel 09/20/2022


Start TaskTracker - 1

initialize()
Remove the original local directory
RPC initialization
TaskReportServer = RPC.getServer(this, bindAddress, tmpPort, max, false, this, fConf);
InterTrackerProtocol jobClient = (InterTrackerProtocol) RPC.waitForProxy(InterTrackerProtocol.class, InterTrackerProtocol.versionID, jobTrackAddr, this.fConf);

37 Dr Gnanasekaran Thangavel 09/20/2022


Run Task on TaskTracker - 1

TaskTracker.localizeJob(TaskInProgress tip);
launchTasksForJob(tip, new JobConf(rjob.jobFile));
tip.launchTask(); // TaskTracker.TaskInProgress
tip.localizeTask(task); // create folder, symbol link
runner = task.createRunner(TaskTracker.this);
runner.start(); // start TaskRunner thread

38 Dr Gnanasekaran Thangavel 09/20/2022


Run Task on TaskTracker - 2

TaskRunner.run();
Configure child process’ jvm parameters, i.e. classpath,
taskid, taskReportServer’s address & port
Start Child Process
 runChild(wrappedCommand, workDir, taskid);

39 Dr Gnanasekaran Thangavel 09/20/2022


Child.main()

Create RPC Proxy, and execute RPC invocation


TaskUmbilicalProtocol umbilical = (TaskUmbilicalProtocol)
RPC.getProxy(TaskUmbilicalProtocol.class,
TaskUmbilicalProtocol.versionID, address, defaultConf);
Task task = umbilical.getTask(taskid);
task.run(); // mapTask / reduceTask.run

40 Dr Gnanasekaran Thangavel 09/20/2022


Finish Job - 1

Child
task.done(umbilical);
RPC call: umbilical.done(taskId, shouldBePromoted)

TaskTracker
done(taskId, shouldPromote)
TaskInProgress tip = tasks.get(taskid);
tip.reportDone(shouldPromote);
taskStatus.setRunState(TaskStatus.State.SUCCEEDED)

41 Dr Gnanasekaran Thangavel 09/20/2022


Finish Job - 2

JobTracker
TaskStatus report: status.getTaskReports();
TaskInProgress tip = taskidToTIPMap.get(taskId);
JobInProgress update JobStatus
 tip.getJob().updateTaskStatus(tip, report, myMetrics);
 One task of current job is finished
 completedTask(tip, taskStatus, metrics);
 If (this.status.getRunState() == JobStatus.RUNNING && allDone)
{this.status.setRunState(JobStatus.SUCCEEDED)}

42 Dr Gnanasekaran Thangavel 09/20/2022


Demo

Word Count
hadoop jar hadoop-0.20.2-examples.jar wordcount <input dir> <output dir>
Hive
hive -f pagerank.hive
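Before and after running the word-count example above, the HDFS command-line interface can stage the input and inspect the output; the directory names here are only placeholders:

hadoop fs -mkdir /user/demo/wcinput
hadoop fs -put local.txt /user/demo/wcinput/
hadoop jar hadoop-0.20.2-examples.jar wordcount /user/demo/wcinput /user/demo/wcoutput
hadoop fs -ls /user/demo/wcoutput
hadoop fs -cat /user/demo/wcoutput/part-*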

43 Dr Gnanasekaran Thangavel 09/20/2022


References
1. Kai Hwang, Geoffrey C. Fox and Jack J. Dongarra, “Distributed and Cloud Computing: From Parallel Processing to the Internet of Things”, First Edition, Morgan Kaufmann Publishers, an imprint of Elsevier, 2012.
2. www.csee.usf.edu/~anda/CIS6930-S11/notes/hadoop.ppt
3. www.ics.uci.edu/~cs237/lectures/cloudvirtualization/Hadoop.pptx

44 Dr Gnanasekaran Thangavel 09/20/2022


Other presentations
http://www.slideshare.net/drgst/presentations

45 Dr Gnanasekaran Thangavel 09/20/2022


Thank You

Questions and Comments?

46 Dr Gnanasekaran Thangavel 09/20/2022
