SlideShare a Scribd company logo
Hadoop Futures
Hadoop Futures
What to watch

Tom White, Cloudera
Hadoop User Group UK, Bristol
10 August 2009
About me
▪   Apache Hadoop Committer, PMC
    Member, Apache Member
▪   Employed by Cloudera
▪   Author of “Hadoop: The Definitive
    Guide”
    ▪   http://hadoopbook.com
Goals
▪   Modular
    ▪   E.g. pluggable block placement algorithm
▪   Multiple languages
    ▪   E.g. not just Java for MapReduce
▪   Integration with other systems
    ▪   E.g. JMX monitoring hooks
The Project Split
▪   Core -> Common, HDFS, MapReduce
▪   New repositories
▪   New mailing lists
    ▪   {common,hdfs,mapreduce}-{user,dev,issues}@hadoop.apache.org
▪   New directory layouts
▪   New configuration
    ▪   hadoop-site.xml -> {core,hdfs,mapreduce}-site.xml
▪   More information at
    ▪   http://www.cloudera.com/blog/2009/07/17/the-project-split/
    ▪   general@hadoop.apache.org
Releases
▪   0.18.3 - 29 Jan 2009
    ▪   Official “stable” release
    ▪   Probably the most commonly used
    ▪   Basis for first Cloudera distribution
▪   0.19.2 - 23 July 2009
    ▪   0.19 series is not widely used
▪   0.20.0 - 22 April 2009
    ▪   Expect large adoption with 0.20.1 release in coming weeks
    ▪   Basis for second Cloudera distribution, first Yahoo! distribution
▪   0.21 series - feature freeze end of August 2009
Hadoop 1.0
▪   After 0.21 release
▪   Need to establish rules about version evolution
    ▪   Hadoop 1.0 Interface Classification - HADOOP-5073
    ▪   API, Data, wire protocol compatibility - HADOOP-5071
Interesting Projects/JIRAs
▪   Common
    ▪   Avro for Hadoop RPC - HADOOP-6170
    ▪   Service lifecycle - HDFS-326
    ▪   Distributed configuration - HADOOP-5670
    ▪   10 minute patch builds - HADOOP-5628, HDFS-458,
        MAPREDUCE-670
    ▪   Ivy/Maven integration - HADOOP-5107
    ▪   Eclipse plugin
Interesting Projects/JIRAs (continued)
▪   MapReduce
    ▪   Metadata in Serialization - HADOOP-6165
    ▪   Compute splits on the cluster - MAPREDUCE-207
    ▪   Context Objects - ongoing migration of libraries/examples
    ▪   Security - HADOOP-4487
    ▪   Schedulers
        ▪   Fair share scheduler - global scheduling, FIFO - MAPREDUCE-548,
            MAPREDUCE-706
        ▪   Capacity - high RAM jobs - HADOOP-5884
    ▪   Speed: new shuffle
        ▪   See http://sortbenchmark.org/Yahoo2009.pdf
Popular JIRAs
▪   http://community.cloudera.com/
Questions?
▪   tom@cloudera.com


▪   Cloudera’s Distribution for Hadoop
    ▪   http://www.cloudera.com/hadoop
(c) 2009 Cloudera, Inc. or its licensors.  "Cloudera" is a registered trademark of Cloudera, Inc.. All rights reserved. 1.0

More Related Content

What's hot (20)

Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Mark Kerzner
 
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Alluxio, Inc.
 
Memcache and Drupal - Vaibhav Jain
Memcache and Drupal - Vaibhav JainMemcache and Drupal - Vaibhav Jain
Memcache and Drupal - Vaibhav Jain
Drupal Camp Delhi
 
Ceph meetup montreal
Ceph meetup montrealCeph meetup montreal
Ceph meetup montreal
Antoine Bayard
 
Working with the Moodle Database: The Basics
Working with the Moodle Database: The BasicsWorking with the Moodle Database: The Basics
Working with the Moodle Database: The Basics
Severalnines
 
Ceph Day Santa Clara: Ceph and Apache CloudStack
Ceph Day Santa Clara: Ceph and Apache CloudStack Ceph Day Santa Clara: Ceph and Apache CloudStack
Ceph Day Santa Clara: Ceph and Apache CloudStack
Ceph Community
 
Breaking performance web rules
Breaking performance web rulesBreaking performance web rules
Breaking performance web rules
Pablo Moretti
 
Hadoop 2.x HDFS Cluster Installation (VirtualBox)
Hadoop 2.x  HDFS Cluster Installation (VirtualBox)Hadoop 2.x  HDFS Cluster Installation (VirtualBox)
Hadoop 2.x HDFS Cluster Installation (VirtualBox)
Amir Sedighi
 
openATTIC Technology Overview - Ceph Management
openATTIC Technology Overview - Ceph ManagementopenATTIC Technology Overview - Ceph Management
openATTIC Technology Overview - Ceph Management
it-novum
 
Usage case of HBase for real-time application
Usage case of HBase for real-time applicationUsage case of HBase for real-time application
Usage case of HBase for real-time application
Edward Yoon
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
HBaseCon
 
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFSHadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Erik Krogen
 
Apache Tajo - BWC 2014
Apache Tajo - BWC 2014Apache Tajo - BWC 2014
Apache Tajo - BWC 2014
Gruter
 
Spark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri SimsaSpark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri Simsa
Spark Summit
 
Introduction to CloudStack Storage Subsystem
Introduction to CloudStack Storage SubsystemIntroduction to CloudStack Storage Subsystem
Introduction to CloudStack Storage Subsystem
buildacloud
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
HBaseCon
 
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to ContributeHBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon
 
Digital Library Collection Management using HBase
Digital Library Collection Management using HBaseDigital Library Collection Management using HBase
Digital Library Collection Management using HBase
HBaseCon
 
GlusterFS And Big Data
GlusterFS And Big DataGlusterFS And Big Data
GlusterFS And Big Data
Lalatendu Mohanty
 
Postgres on OpenStack
Postgres on OpenStackPostgres on OpenStack
Postgres on OpenStack
EDB
 
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Mark Kerzner
 
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-Sensitive Queries for Presto at Facebook: A Collaboration ...
Alluxio, Inc.
 
Memcache and Drupal - Vaibhav Jain
Memcache and Drupal - Vaibhav JainMemcache and Drupal - Vaibhav Jain
Memcache and Drupal - Vaibhav Jain
Drupal Camp Delhi
 
Working with the Moodle Database: The Basics
Working with the Moodle Database: The BasicsWorking with the Moodle Database: The Basics
Working with the Moodle Database: The Basics
Severalnines
 
Ceph Day Santa Clara: Ceph and Apache CloudStack
Ceph Day Santa Clara: Ceph and Apache CloudStack Ceph Day Santa Clara: Ceph and Apache CloudStack
Ceph Day Santa Clara: Ceph and Apache CloudStack
Ceph Community
 
Breaking performance web rules
Breaking performance web rulesBreaking performance web rules
Breaking performance web rules
Pablo Moretti
 
Hadoop 2.x HDFS Cluster Installation (VirtualBox)
Hadoop 2.x  HDFS Cluster Installation (VirtualBox)Hadoop 2.x  HDFS Cluster Installation (VirtualBox)
Hadoop 2.x HDFS Cluster Installation (VirtualBox)
Amir Sedighi
 
openATTIC Technology Overview - Ceph Management
openATTIC Technology Overview - Ceph ManagementopenATTIC Technology Overview - Ceph Management
openATTIC Technology Overview - Ceph Management
it-novum
 
Usage case of HBase for real-time application
Usage case of HBase for real-time applicationUsage case of HBase for real-time application
Usage case of HBase for real-time application
Edward Yoon
 
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload DiversityHarmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
Harmonizing Multi-tenant HBase Clusters for Managing Workload Diversity
HBaseCon
 
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFSHadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Erik Krogen
 
Apache Tajo - BWC 2014
Apache Tajo - BWC 2014Apache Tajo - BWC 2014
Apache Tajo - BWC 2014
Gruter
 
Spark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri SimsaSpark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri Simsa
Spark Summit
 
Introduction to CloudStack Storage Subsystem
Introduction to CloudStack Storage SubsystemIntroduction to CloudStack Storage Subsystem
Introduction to CloudStack Storage Subsystem
buildacloud
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
HBaseCon
 
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to ContributeHBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon 2015: State of HBase Docs and How to Contribute
HBaseCon
 
Digital Library Collection Management using HBase
Digital Library Collection Management using HBaseDigital Library Collection Management using HBase
Digital Library Collection Management using HBase
HBaseCon
 
Postgres on OpenStack
Postgres on OpenStackPostgres on OpenStack
Postgres on OpenStack
EDB
 

Viewers also liked (19)

Graphs
GraphsGraphs
Graphs
Steve Loughran
 
Hadoop Hackathon Reader
Hadoop Hackathon ReaderHadoop Hackathon Reader
Hadoop Hackathon Reader
Evert Lammerts
 
Large-Scale Data Storage and Processing for Scientists with Hadoop
Large-Scale Data Storage and Processing for Scientists with HadoopLarge-Scale Data Storage and Processing for Scientists with Hadoop
Large-Scale Data Storage and Processing for Scientists with Hadoop
Evert Lammerts
 
A hadoop implementation of pagerank
A hadoop implementation of pagerankA hadoop implementation of pagerank
A hadoop implementation of pagerank
Chengeng Ma
 
Hadoop implementation for algorithms apriori, pcy, son
Hadoop implementation for algorithms apriori, pcy, sonHadoop implementation for algorithms apriori, pcy, son
Hadoop implementation for algorithms apriori, pcy, son
Chengeng Ma
 
Google PageRank
Google PageRankGoogle PageRank
Google PageRank
Beat Signer
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce Algorithms
Amund Tveit
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
DataWorks Summit
 
Large Scale Data Analysis with Map/Reduce, part I
Large Scale Data Analysis with Map/Reduce, part ILarge Scale Data Analysis with Map/Reduce, part I
Large Scale Data Analysis with Map/Reduce, part I
Marin Dimitrov
 
Big Data - O que é o hadoop, map reduce, hdfs e hive
Big Data - O que é o hadoop, map reduce, hdfs e hiveBig Data - O que é o hadoop, map reduce, hdfs e hive
Big Data - O que é o hadoop, map reduce, hdfs e hive
Flavio Fonte, PMP, ITIL
 
Pig, Making Hadoop Easy
Pig, Making Hadoop EasyPig, Making Hadoop Easy
Pig, Making Hadoop Easy
Nick Dimiduk
 
introduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pigintroduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pig
Ricardo Varela
 
Practical Problem Solving with Apache Hadoop & Pig
Practical Problem Solving with Apache Hadoop & PigPractical Problem Solving with Apache Hadoop & Pig
Practical Problem Solving with Apache Hadoop & Pig
Milind Bhandarkar
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
Guido Schmutz
 
Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)
Kevin Weil
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Helena Edelson
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Helena Edelson
 
Pagerank Algorithm Explained
Pagerank Algorithm ExplainedPagerank Algorithm Explained
Pagerank Algorithm Explained
jdhaar
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reduce
rantav
 
Hadoop Hackathon Reader
Hadoop Hackathon ReaderHadoop Hackathon Reader
Hadoop Hackathon Reader
Evert Lammerts
 
Large-Scale Data Storage and Processing for Scientists with Hadoop
Large-Scale Data Storage and Processing for Scientists with HadoopLarge-Scale Data Storage and Processing for Scientists with Hadoop
Large-Scale Data Storage and Processing for Scientists with Hadoop
Evert Lammerts
 
A hadoop implementation of pagerank
A hadoop implementation of pagerankA hadoop implementation of pagerank
A hadoop implementation of pagerank
Chengeng Ma
 
Hadoop implementation for algorithms apriori, pcy, son
Hadoop implementation for algorithms apriori, pcy, sonHadoop implementation for algorithms apriori, pcy, son
Hadoop implementation for algorithms apriori, pcy, son
Chengeng Ma
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce Algorithms
Amund Tveit
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
DataWorks Summit
 
Large Scale Data Analysis with Map/Reduce, part I
Large Scale Data Analysis with Map/Reduce, part ILarge Scale Data Analysis with Map/Reduce, part I
Large Scale Data Analysis with Map/Reduce, part I
Marin Dimitrov
 
Big Data - O que é o hadoop, map reduce, hdfs e hive
Big Data - O que é o hadoop, map reduce, hdfs e hiveBig Data - O que é o hadoop, map reduce, hdfs e hive
Big Data - O que é o hadoop, map reduce, hdfs e hive
Flavio Fonte, PMP, ITIL
 
Pig, Making Hadoop Easy
Pig, Making Hadoop EasyPig, Making Hadoop Easy
Pig, Making Hadoop Easy
Nick Dimiduk
 
introduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pigintroduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pig
Ricardo Varela
 
Practical Problem Solving with Apache Hadoop & Pig
Practical Problem Solving with Apache Hadoop & PigPractical Problem Solving with Apache Hadoop & Pig
Practical Problem Solving with Apache Hadoop & Pig
Milind Bhandarkar
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
Guido Schmutz
 
Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)
Kevin Weil
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Helena Edelson
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Helena Edelson
 
Pagerank Algorithm Explained
Pagerank Algorithm ExplainedPagerank Algorithm Explained
Pagerank Algorithm Explained
jdhaar
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reduce
rantav
 

Similar to Hadoop Futures (20)

Webinar: The Future of Hadoop
Webinar: The Future of HadoopWebinar: The Future of Hadoop
Webinar: The Future of Hadoop
Cloudera, Inc.
 
Hadoop pycon2011uk
Hadoop pycon2011ukHadoop pycon2011uk
Hadoop pycon2011uk
Aditya Sakhuja
 
Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
Ramesh Pabba - seeking new projects
 
Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
Ramesh Pabba - seeking new projects
 
Hadoop Conference Japan 2011 Fallに行ってきました
Hadoop Conference Japan 2011 Fallに行ってきましたHadoop Conference Japan 2011 Fallに行ってきました
Hadoop Conference Japan 2011 Fallに行ってきました
moai kids
 
Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
Ramesh Pabba - seeking new projects
 
Hadoop and OpenStack - Hadoop Summit San Jose 2014
Hadoop and OpenStack - Hadoop Summit San Jose 2014Hadoop and OpenStack - Hadoop Summit San Jose 2014
Hadoop and OpenStack - Hadoop Summit San Jose 2014
spinningmatt
 
Hadoop and OpenStack
Hadoop and OpenStackHadoop and OpenStack
Hadoop and OpenStack
DataWorks Summit
 
DC HUG Hadoop for Windows
DC HUG Hadoop for WindowsDC HUG Hadoop for Windows
DC HUG Hadoop for Windows
Terry Padgett
 
Hadoop on OpenStack - Sahara @DevNation 2014
Hadoop on OpenStack - Sahara @DevNation 2014Hadoop on OpenStack - Sahara @DevNation 2014
Hadoop on OpenStack - Sahara @DevNation 2014
spinningmatt
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Matt Fuller
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
Rajesh Nadipalli
 
HdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft PlatformHdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft Platform
nvvrajesh
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
Rajesh Nadipalli
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Pentaho
 
Hadoop operations
Hadoop operationsHadoop operations
Hadoop operations
Marc Cluet
 
Hw09 Welcome To Hadoop World
Hw09   Welcome To Hadoop WorldHw09   Welcome To Hadoop World
Hw09 Welcome To Hadoop World
Cloudera, Inc.
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
Mike Pittaro
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
tcloudcomputing-tw
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
DataWorks Summit
 
Webinar: The Future of Hadoop
Webinar: The Future of HadoopWebinar: The Future of Hadoop
Webinar: The Future of Hadoop
Cloudera, Inc.
 
Hadoop Conference Japan 2011 Fallに行ってきました
Hadoop Conference Japan 2011 Fallに行ってきましたHadoop Conference Japan 2011 Fallに行ってきました
Hadoop Conference Japan 2011 Fallに行ってきました
moai kids
 
Hadoop and OpenStack - Hadoop Summit San Jose 2014
Hadoop and OpenStack - Hadoop Summit San Jose 2014Hadoop and OpenStack - Hadoop Summit San Jose 2014
Hadoop and OpenStack - Hadoop Summit San Jose 2014
spinningmatt
 
DC HUG Hadoop for Windows
DC HUG Hadoop for WindowsDC HUG Hadoop for Windows
DC HUG Hadoop for Windows
Terry Padgett
 
Hadoop on OpenStack - Sahara @DevNation 2014
Hadoop on OpenStack - Sahara @DevNation 2014Hadoop on OpenStack - Sahara @DevNation 2014
Hadoop on OpenStack - Sahara @DevNation 2014
spinningmatt
 
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Hello, Enterprise! Meet Presto. (Presto Boston Meetup 10062015)
Matt Fuller
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
Rajesh Nadipalli
 
HdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft PlatformHdInsight essentials Hadoop on Microsoft Platform
HdInsight essentials Hadoop on Microsoft Platform
nvvrajesh
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
Rajesh Nadipalli
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Pentaho
 
Hadoop operations
Hadoop operationsHadoop operations
Hadoop operations
Marc Cluet
 
Hw09 Welcome To Hadoop World
Hw09   Welcome To Hadoop WorldHw09   Welcome To Hadoop World
Hw09 Welcome To Hadoop World
Cloudera, Inc.
 
Optimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for HadoopOptimizing Dell PowerEdge Configurations for Hadoop
Optimizing Dell PowerEdge Configurations for Hadoop
Mike Pittaro
 
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
Tcloud Computing Hadoop Family and Ecosystem Service 2013.Q3
tcloudcomputing-tw
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
DataWorks Summit
 

More from Steve Loughran (20)

Hadoop Vectored IO
Hadoop Vectored IOHadoop Vectored IO
Hadoop Vectored IO
Steve Loughran
 
The age of rename() is over
The age of rename() is overThe age of rename() is over
The age of rename() is over
Steve Loughran
 
What does Rename Do: (detailed version)
What does Rename Do: (detailed version)What does Rename Do: (detailed version)
What does Rename Do: (detailed version)
Steve Loughran
 
Put is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit EditionPut is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit Edition
Steve Loughran
 
@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!
Steve Loughran
 
PUT is the new rename()
PUT is the new rename()PUT is the new rename()
PUT is the new rename()
Steve Loughran
 
Extreme Programming Deployed
Extreme Programming DeployedExtreme Programming Deployed
Extreme Programming Deployed
Steve Loughran
 
Testing
TestingTesting
Testing
Steve Loughran
 
I hate mocking
I hate mockingI hate mocking
I hate mocking
Steve Loughran
 
What does rename() do?
What does rename() do?What does rename() do?
What does rename() do?
Steve Loughran
 
Dancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and HiveDancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and Hive
Steve Loughran
 
Apache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User GroupApache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User Group
Steve Loughran
 
Spark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object storesSpark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object stores
Steve Loughran
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object Stores
Steve Loughran
 
Apache Spark and Object Stores
Apache Spark and Object StoresApache Spark and Object Stores
Apache Spark and Object Stores
Steve Loughran
 
Household INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony EraHousehold INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony Era
Steve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Steve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
Steve Loughran
 
Slider: Applications on YARN
Slider: Applications on YARNSlider: Applications on YARN
Slider: Applications on YARN
Steve Loughran
 
YARN Services
YARN ServicesYARN Services
YARN Services
Steve Loughran
 
The age of rename() is over
The age of rename() is overThe age of rename() is over
The age of rename() is over
Steve Loughran
 
What does Rename Do: (detailed version)
What does Rename Do: (detailed version)What does Rename Do: (detailed version)
What does Rename Do: (detailed version)
Steve Loughran
 
Put is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit EditionPut is the new rename: San Jose Summit Edition
Put is the new rename: San Jose Summit Edition
Steve Loughran
 
@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!@Dissidentbot: dissent will be automated!
@Dissidentbot: dissent will be automated!
Steve Loughran
 
PUT is the new rename()
PUT is the new rename()PUT is the new rename()
PUT is the new rename()
Steve Loughran
 
Extreme Programming Deployed
Extreme Programming DeployedExtreme Programming Deployed
Extreme Programming Deployed
Steve Loughran
 
What does rename() do?
What does rename() do?What does rename() do?
What does rename() do?
Steve Loughran
 
Dancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and HiveDancing Elephants: Working with Object Storage in Apache Spark and Hive
Dancing Elephants: Working with Object Storage in Apache Spark and Hive
Steve Loughran
 
Apache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User GroupApache Spark and Object Stores —for London Spark User Group
Apache Spark and Object Stores —for London Spark User Group
Steve Loughran
 
Spark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object storesSpark Summit East 2017: Apache spark and object stores
Spark Summit East 2017: Apache spark and object stores
Steve Loughran
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object Stores
Steve Loughran
 
Apache Spark and Object Stores
Apache Spark and Object StoresApache Spark and Object Stores
Apache Spark and Object Stores
Steve Loughran
 
Household INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony EraHousehold INFOSEC in a Post-Sony Era
Household INFOSEC in a Post-Sony Era
Steve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Steve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
Steve Loughran
 
Slider: Applications on YARN
Slider: Applications on YARNSlider: Applications on YARN
Slider: Applications on YARN
Steve Loughran
 

Recently uploaded (20)

Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Rusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond SparkRusty Waters: Elevating Lakehouses Beyond Spark
Rusty Waters: Elevating Lakehouses Beyond Spark
carlyakerly1
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul Shares 5 Steps to Implement AI Agents for Maximum Business Efficien...
Noah Loul
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 

Hadoop Futures

  • 2. Hadoop Futures What to watch Tom White, Cloudera Hadoop User Group UK, Bristol 10 August 2009
  • 3. About me ▪ Apache Hadoop Committer, PMC Member, Apache Member ▪ Employed by Cloudera ▪ Author of “Hadoop: The Definitive Guide” ▪ http://hadoopbook.com
  • 4. Goals ▪ Modular ▪ E.g. pluggable block placement algorithm ▪ Multiple languages ▪ E.g. not just Java for MapReduce ▪ Integration with other systems ▪ E.g. JMX monitoring hooks
  • 5. The Project Split ▪ Core -> Common, HDFS, MapReduce ▪ New repositories ▪ New mailing lists ▪ {common,hdfs,mapreduce}-{user,dev,issues}@hadoop.apache.org ▪ New directory layouts ▪ New configuration ▪ hadoop-site.xml -> {core,hdfs,mapreduce}-site.xml ▪ More information at ▪ http://www.cloudera.com/blog/2009/07/17/the-project-split/ ▪ [email protected]
  • 6. Releases ▪ 0.18.3 - 29 Jan 2009 ▪ Official “stable” release ▪ Probably the most commonly used ▪ Basis for first Cloudera distribution ▪ 0.19.2 - 23 July 2009 ▪ 0.19 series is not widely used ▪ 0.20.0 - 22 April 2009 ▪ Expect large adoption with 0.20.1 release in coming weeks ▪ Basis for second Cloudera distribution, first Yahoo! distribution ▪ 0.21 series - feature freeze end of August 2009
  • 7. Hadoop 1.0 ▪ After 0.21 release ▪ Need to establish rules about version evolution ▪ Hadoop 1.0 Interface Classification - HADOOP-5073 ▪ API, Data, wire protocol compatibility - HADOOP-5071
  • 8. Interesting Projects/JIRAs ▪ Common ▪ Avro for Hadoop RPC - HADOOP-6170 ▪ Service lifecycle - HDFS-326 ▪ Distributed configuration - HADOOP-5670 ▪ 10 minute patch builds - HADOOP-5628, HDFS-458, MAPREDUCE-670 ▪ Ivy/Maven integration - HADOOP-5107 ▪ Eclipse plugin
  • 9. Interesting Projects/JIRAs (continued) ▪ MapReduce ▪ Metadata in Serialization - HADOOP-6165 ▪ Compute splits on the cluster - MAPREDUCE-207 ▪ Context Objects - ongoing migration of libraries/examples ▪ Security - HADOOP-4487 ▪ Schedulers ▪ Fair share scheduler - global scheduling, FIFO - MAPREDUCE-548, MAPREDUCE-706 ▪ Capacity - high RAM jobs - HADOOP-5884 ▪ Speed: new shuffle ▪ See http://sortbenchmark.org/Yahoo2009.pdf
  • 10. Popular JIRAs ▪ http://community.cloudera.com/
  • 11. Questions? ▪ [email protected] ▪ Cloudera’s Distribution for Hadoop ▪ http://www.cloudera.com/hadoop
  • 12. (c) 2009 Cloudera, Inc. or its licensors.  "Cloudera" is a registered trademark of Cloudera, Inc.. All rights reserved. 1.0