Unit IV GCC
Figure: Globus Toolkit GT4 supports distributed and cluster computing services
Client-Globus Interactions
There are strong interactions between provider programs and user code. GT4 makes heavy use of industry-standard web service protocols and mechanisms for service description, discovery, access, authentication, authorization, and the like. User code for GT4 can be written in Java, C, or Python. Web service mechanisms define specific interfaces for grid computing, and web services provide flexible, extensible, and widely adopted XML-based interfaces.
Grid applications demand computational, communication, data, and storage resources, so GT4 must enable a range of end-user tools that provide the higher-level capabilities needed in specific user applications. Developers can use these services and libraries to build both simple and complex systems quickly.
Figure: Client and GT4 server interactions; vertical boxes correspond to service programs and horizontal boxes represent user code.
The horizontal boxes in the client domain denote custom applications and/or third-party
tools that access GT4 services. The toolkit programs provide a set of useful infrastructure
services.
Three containers are used to host user-developed services written in Java, Python, and C,
respectively. These containers provide implementations of security, management, discovery,
state management, and other mechanisms frequently required when building services.
The Hadoop MapReduce environment provides the user with a sophisticated framework to manage the execution of map and reduce tasks across a cluster of machines.
The user is required to tell the framework the following (a configuration sketch follows this list):
• The location(s) in the distributed file system of the job input
• The location(s) in the distributed file system for the job output
• The input format
• The output format
• The class containing the map function
• Optionally, the class containing the reduce function
• The JAR file(s) containing the map and reduce functions and any support classes
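As a rough illustration of how these items are wired together, the sketch below uses the classic org.apache.hadoop.mapred API; the driver class name and the command-line paths are placeholders, and the identity mapper and reducer (shown later in this unit) stand in for real user classes.
SimpleJobDriver.java (illustrative sketch)
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;
public class SimpleJobDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(SimpleJobDriver.class);       // JAR containing the job classes
    conf.setJobName("simple-job");
    FileInputFormat.setInputPaths(conf, new Path(args[0]));  // job input location(s) in HDFS
    FileOutputFormat.setOutputPath(conf, new Path(args[1])); // job output location in HDFS
    conf.setInputFormat(TextInputFormat.class);              // input format
    conf.setOutputFormat(TextOutputFormat.class);            // output format
    conf.setMapperClass(IdentityMapper.class);               // class containing the map function
    conf.setReducerClass(IdentityReducer.class);             // optional: class containing the reduce function
    conf.setOutputKeyClass(LongWritable.class);              // key type produced by the job
    conf.setOutputValueClass(Text.class);                    // value type produced by the job
    JobClient.runJob(conf);                                  // submit the job and wait for completion
  }
}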
The final output will be moved to the output directory, and the job status will be reported to the user.
MapReduce is oriented around key/value pairs. The framework will convert each
record of input into a key/value pair, and each pair will be input to the map function once. The
map output is a set of key/value pairs—nominally one pair that is the transformed input pair. The
map output pairs are grouped and sorted by key. The reduce function is called one time for each
key, in sort sequence, with the key and the set of values that share that key. The reduce method
may output an arbitrary number of key/value pairs, which are written to the output files in the job
output directory. If the reduce output keys are unchanged from the reduce input keys, the final
output will be sorted. The framework provides two processes that handle the management of
MapReduce jobs:
• TaskTracker manages the execution of individual map and reduce tasks on a compute
node in the cluster.
• JobTracker accepts job submissions, provides job monitoring and control, and manages
the distribution of tasks to the TaskTracker nodes.
The JobTracker is a single point of failure; it will, however, work around the failure of individual TaskTracker processes.
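To make the key/value flow described above concrete, here is a hedged word-count sketch against the same classic org.apache.hadoop.mapred API used later in this unit; the class names are illustrative. The map function turns each input record into (word, 1) pairs, the framework groups and sorts the pairs by word, and the reduce function sums the counts for each word. Because the reducer passes each key through unchanged, the final output remains sorted by word.
WordCount.java (illustrative sketch)
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
public class WordCount {
  /** Map: emits one (word, 1) pair per token in the input record. */
  public static class Map extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();
    public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      StringTokenizer tokens = new StringTokenizer(value.toString());
      while (tokens.hasMoreTokens()) {
        word.set(tokens.nextToken());
        output.collect(word, ONE);
      }
    }
  }
  /** Reduce: called once per unique word, in sort order, with all of its counts. */
  public static class Reduce extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));  // key is unchanged, so output stays sorted
    }
  }
}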
The Hadoop Distributed File System
HDFS is a file system designed for MapReduce jobs that read input in large chunks, process it, and write potentially large chunks of output. HDFS does not handle
random access particularly well. For reliability, file data is simply mirrored to multiple storage
nodes. This is referred to as replication in the Hadoop community. As long as at least one replica
of a data chunk is available, the consumer of that data will not know of storage server failures.
HDFS services are provided by two processes:
• NameNode handles management of the file system metadata, and provides management and
control services.
• DataNode provides block storage and retrieval services.
There will be one NameNode process in an HDFS file system, and this is a single point of
failure. Hadoop Core provides recovery and automatic backup of the NameNode, but no hot
failover services. There will be multiple DataNode processes within the cluster, with typically
one DataNode process per storage node in a cluster.
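As a rough sketch of this division of labor (assuming the standard org.apache.hadoop.fs.FileSystem client API; the path argument is a placeholder), a client can ask the NameNode for file metadata and for the DataNodes holding each block; the block contents themselves would then be read from those DataNodes.
ShowBlockLocations.java (illustrative sketch)
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class ShowBlockLocations {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create(args[0]), conf);
    // File metadata (length, replication, block size) is served by the NameNode.
    FileStatus status = fs.getFileStatus(new Path(args[0]));
    System.out.println("length=" + status.getLen()
        + " replication=" + status.getReplication()
        + " blockSize=" + status.getBlockSize());
    // Each block location lists the DataNodes that store a replica of that block.
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.println("offset=" + block.getOffset()
          + " hosts=" + String.join(",", block.getHosts()));
    }
  }
}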
IdentityMapper.java
package org.apache.hadoop.mapred.lib;
import java.io.IOException;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.MapReduceBase;
/** Implements the identity function, mapping inputs directly to outputs. */
public class IdentityMapper<K, V>
extends MapReduceBase implements Mapper<K, V, K, V> {
/** The identity function. Input key/value pair is written directly to
* output.*/
public void map(K key, V val,
OutputCollector<K, V> output, Reporter reporter)
throws IOException {
output.collect(key, val);
}
}
A Simple Reduce Function: IdentityReducer
The Hadoop framework calls the reduce function one time for each unique key. The framework
provides the key and the set of values that share that key.
IdentityReducer.java
package org.apache.hadoop.mapred.lib;
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.MapReduceBase;
/** Performs no reduction, writing all input values directly to the output. */
public class IdentityReducer<K, V>
extends MapReduceBase implements Reducer<K, V, K, V> {
/** Writes all keys and values directly to output. */
public void reduce(K key, Iterator<V> values,
OutputCollector<K, V> output, Reporter reporter)
throws IOException {
while (values.hasNext()) {
output.collect(key, values.next());
}
}
}
If you require the output of your job to be sorted, the reducer function must pass the key objects
to the output.collect() method unchanged. The reduce phase is, however, free to output any
number of records, including zero records, with the same key and different values.
To read a file, the client first calls open() on the FileSystem object (for HDFS, a DistributedFileSystem), which contacts the namenode to determine the locations of the first few blocks and returns an FSDataInputStream wrapping a DFSInputStream. The client then calls read() on the stream. DFSInputStream, which has stored the
datanode addresses for the first few blocks in the file, then connects to the first (closest) datanode
for the first block in the file. Data is streamed from the datanode back to the client, which calls
read() repeatedly on the stream. When the end of the block is reached, DFSInputStream will
close the connection to the datanode, then find the best datanode for the next block. This happens
transparently to the client, which from its point of view is just reading a continuous stream.
Blocks are read in order with the DFSInputStream opening new connections to datanodes
as the client reads through the stream. It will also call the namenode to retrieve the datanode
locations for the next batch of blocks as needed. When the client has finished reading, it calls
close() on the FSDataInputStream.
Figure: Network distance in Hadoop
During reading, if the DFSInputStream encounters an error while communicating with a
datanode, then it will try the next closest one for that block. It will also remember datanodes that
have failed so that it doesn’t needlessly retry them for later blocks. The DFSInputStream also
verifies checksums for the data transferred to it from the datanode.
If a corrupted block is found, it is reported to the namenode before the DFSInputStream attempts to read a replica of the block from another datanode. One important aspect of this design
is that the client contacts datanodes directly to retrieve data and is guided by the namenode to the
best datanode for each block. This design allows HDFS to scale to a large number of concurrent
clients, since the data traffic is spread across all the datanodes in the cluster.
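A minimal client-side sketch of this read path, assuming the standard FileSystem API (the URI is a placeholder); the namenode and datanode interactions described above happen inside the FSDataInputStream/DFSInputStream returned by open().
CatHdfsFile.java (illustrative sketch)
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
public class CatHdfsFile {
  public static void main(String[] args) throws Exception {
    String uri = args[0];                              // e.g. an hdfs:// path (placeholder)
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create(uri), conf);
    FSDataInputStream in = null;
    try {
      in = fs.open(new Path(uri));                     // namenode supplies the first block locations
      IOUtils.copyBytes(in, System.out, 4096, false);  // repeated read() calls, block by block
    } finally {
      IOUtils.closeStream(in);                         // close() tears down the datanode connection
    }
  }
}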
To write a file, the client calls create() on the DistributedFileSystem (step 1), which asks the namenode to create a new file in the filesystem namespace (step 2). The DistributedFileSystem returns an FSDataOutputStream for the client to start writing data to. Just as in the read case, FSDataOutputStream wraps a DFSOutputStream, which handles communication with the datanodes and namenode. As the client writes data (step 3),
DFSOutputStream splits it into packets, which it writes to an internal queue, called the data
queue. The data queue is consumed by the DataStreamer, whose responsibility it is to ask the
namenode to allocate new blocks by picking a list of suitable datanodes to store the replicas. The
list of datanodes forms a pipeline—we’ll assume the replication level is three, so there are three
nodes in the pipeline. The DataStreamer streams the packets to the first datanode in the pipeline,
which stores the packet and forwards it to the second datanode in the pipeline. Similarly, the
second datanode stores the packet and forwards it to the third (and last) datanode in the pipeline
(step 4). DFSOutputStream also maintains an internal queue of packets that are waiting to be
acknowledged by datanodes, called the ack queue. A packet is removed from the ack queue only
when it has been acknowledged by all the datanodes in the pipeline (step 5). If a datanode fails
while data is being written to it, then the following actions are taken, which are transparent to the
client writing the data.
Figure: A typical replica pipeline
First the pipeline is closed, and any packets in the ack queue are added to the front of the
data queue so that datanodes that are downstream from the failed node will not miss any packets.
The current block on the good datanodes is given a new identity, which is communicated to the namenode, so that the partial block on the failed datanode will be deleted if the failed datanode recovers later on. The failed datanode is removed from the pipeline and the remainder of the
block’s data is written to the two good datanodes in the pipeline. The namenode notices that the
block is under-replicated, and it arranges for a further replica to be created on another node.
Subsequent blocks are then treated as normal. It’s possible, but unlikely, that multiple datanodes
fail while a block is being written. As long as dfs.replication.min replicas (default one) are
written, the write will succeed, and the block will be asynchronously replicated across the cluster
until its target replication factor is reached.
When the client has finished writing data, it calls close() on the stream (step 6). This
action flushes all the remaining packets to the datanode pipeline and waits for acknowledgments
before contacting the namenode to signal that the file is complete (step 7). The namenode already
knows which blocks the file is made up of.
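A minimal client-side sketch of the write path just described, again assuming the standard FileSystem API (the local source and HDFS destination paths are placeholders); packet queuing, the replica pipeline, and the final completion call to the namenode all happen beneath the FSDataOutputStream returned by create().
CopyLocalToHdfs.java (illustrative sketch)
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
public class CopyLocalToHdfs {
  public static void main(String[] args) throws Exception {
    String localSrc = args[0];                  // placeholder local source file
    String dst = args[1];                       // placeholder hdfs:// destination
    InputStream in = new BufferedInputStream(new FileInputStream(localSrc));
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create(dst), conf);
    FSDataOutputStream out = fs.create(new Path(dst));  // namenode records the new file (steps 1 and 2)
    try {
      IOUtils.copyBytes(in, out, 4096, false);  // data is split into packets on the data queue (step 3)
    } finally {
      out.close();                              // flush remaining packets, wait for acks, complete the file
      IOUtils.closeStream(in);
    }
  }
}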