SlideShare a Scribd company logo
MongoDB + Hadoop
Wingman, MongoDB Inc.
Norberto Leite
#bcnmug
Agenda
•  MongoDB
•  Hadoop
•  MongoDB + Hadoop Connector
•  How it works
•  What can we do with it
MongoDB
MongoDB
The leading NoSQL database
Document
Database
Open-
Source
General
Purpose
Better Data
Locality
Performance
In-Memory Caching In-Place
Updates
Scalability
Auto-Sharding
•  Increase capacity as you go
•  Commodity and cloud architectures
•  Improved operational simplicity and cost visibility
High Availability
•  Automated replication and failover
•  Multi-data center support
•  Improved operational simplicity (e.g., HW swaps)
•  Data durability and consistency
Shell
Command-line shell for interacting
directly with database
Shell and Drivers
Drivers
Drivers for most popular
programming languages and
frameworks
> db.collection.insert({product:“MongoDB”,
type:“Document Database”})
>
> db.collection.findOne()
{
“_id” : ObjectId(“5106c1c2fc629bfe52792e86”),
“product” : “MongoDB”
“type” : “Document Database”
}
Java
Python
Perl
Ruby
Haskell
JavaScript
Hadoop
Hadoop
•  Google publishes seminal papers
–  GFS – global file system (Oct 2003)
–  MapReduce – divide and conquer (Dec 2004)
–  How they indexed the internet
•  Yahoo builds and open sources (2006)
–  Doug Cutting lead, now at Cloudera
–  Most others now at Hortonworks
•  Commonly mentioned has:
–  The elephant in the room!
Hadoop
•  Primary components
–  HDFS–Hadoop Distributed File System
–  MapReduce–parallel processing engine
•  Ecosystem
–  HIVE
–  HBASE
–  PIG
–  Oozie
–  Sqoop
–  Zookeeper
Barcelona MUG MongoDB + Hadoop Presentation
MongoDB+Hadoop Connector
Barcelona MUG MongoDB + Hadoop Presentation
MongoDB Hadoop Connector
•  http://api.mongodb.org/hadoop/MongoDB
%2BHadoop+Connector.html
•  Allows interoperation of MongoDB and Hadoop
•  “Give power to the people”
•  Allows processing across multiple sources
•  Avoid custom hacky exports and imports
•  Scalability and Flexibility to accommodate Hadoop
and|or MongoDB changes
MongoDB with Hadoop• 
MongoDB
MongoDB with Hadoop• 
MongoDB warehouse
MongoDB with Hadoop
MongoDBETL
Benefits and Features
•  Full multi-core parallelism to process data in
MongoDB
•  Full integration w/ Hadoop and JVM ecosystem
•  Can be used on Amazon Elastic MapReduce
•  Read and write backup files to local,HDFS and S3
•  Vanilla Java MapReduce
•  But not only using Hadoop Streaming
Benefits and Features
•  Supports HIVE•  Supports PIG
How it works
•  Adapter examines MongoDB input collection and
calculates a set of splits from data
•  Each split is assigned to a Hadoop node
•  In parallel hadoop pulls data from splits on
MongoDB (or BSON) and starts processing locally
•  Hadoop merges results and streams output back to
MongoDB (or BSON) output collection
Example Tour
Tour bus stops
•  Java MapReduce w/ MongoDB-Hadoop Connector
•  Using Hadoop Streaming
•  Pig and Hive
Data Set
•  ENRON emails corpus (501 records,1.75GB)
•  Each document is one email
•  https://www.cs.cmu.edu/~enron/
{
"_id" : ObjectId("4f2ad4c4d1e2d3f15a000000"),
"body" : "Here is our forecastnn ",
"filename" : "1.",
"headers" : {
"From" : "phillip.allen@enron.com",
"Subject" : "Forecast Info",
"X-bcc" : "",
"To" : "tim.belden@enron.com",
"X-Origin" : "Allen-P",
"X-From" : "Phillip K Allen",
"Date" : "Mon, 14 May 2001 16:39:00 -0700 (PDT)",
"X-To" : "Tim Belden ",
"Message-ID" : "<18782981.1075855378110.JavaMail.evans@thyme>",
"Content-Type" : "text/plain; charset=us-ascii",
"Mime-Version" : "1.0"
}
}
Document Example
Let’s build a graph between
senders and recipients and
count the messages exchanged
Graph Sketch
{"_id": {"t":"bob@enron.com", "f":"alice@enron.com"}, "count" : 14}
{"_id": {"t":"bob@enron.com", "f":"eve@enron.com"}, "count" : 9}
{"_id": {"t":"alice@enron.com", "f":"charlie@enron.com"}, "count" : 99}
{"_id": {"t":"charlie@enron.com", "f":"bob@enron.com"}, "count" : 48}
{"_id": {"t":"eve@enron.com", "f":"charlie@enron.com"}, "count" : 20}
Receiver Sender Pairs
Vanilla Java MapReduce
@Override
public void map(NullWritable key, BSONObject val, final Context context){
BSONObject headers = (BSONObject)val.get("headers");
if(headers.containsKey("From") && headers.containsKey("To")){
String from = (String)headers.get("From"); String to = (String) headers.get("To");
String[] recips = to.split(",");
for(int i=0;i<recips.length;i++){
String recip = recips[i].trim();
context.write(new MailPair(from, recip), new IntWritable(1));
}
}
}
Map Phase – each document get’s
through mapper function
public void reduce( final MailPair pKey, final Iterable<IntWritable> pValues,
final Context pContext ){
int sum = 0;
for ( final IntWritable value : pValues ){
sum += value.get();
}
BSONObject outDoc = new BasicDBObjectBuilder().start()
.add( "f" , pKey.from)
.add( "t" , pKey.to )
.get();
BSONWritable pkeyOut = new BSONWritable(outDoc);
pContext.write( pkeyOut, new IntWritable(sum) ); }
Reduce Phase – output Maps are
grouped by key and passed to Reducer
mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat
mongo.input.uri=mongodb://my-db:27017/enron.messages
mongo.job.input.format=com.mongodb.hadoop.BSONFileInputFormat
mapred.input.dir= file:///tmp/messages.bson
mapred.input.dir= hdfs:///tmp/messages.bson
mapred.input.dir= s3:///tmp/messages.bson
(or BSON)Read From MongoDB
mongo.job.output.format=com.mongodb.hadoop.MongoOutputFormat
mongo.output.uri=mongodb://my-db:27017/enron.results_out
mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
mapred.output.dir= file:///tmp/results.bson
mapred.output.dir= hdfs:///tmp/results.bson
mapred.output.dir= s3:///tmp/results.bson
(or BSON)Write To MongoDB
mongos> db.streaming.output.find({"_id.t": /^kenneth.lay/})
{ "_id" : { "t" : "kenneth.lay@enron.com",
"f" : "15126-1267@m2.innovyx.com" }, "count" : 1 }
{ "_id" : { "t" : "kenneth.lay@enron.com",
"f" : "2586207@www4.imakenews.com" }, "count" : 1 }
{ "_id" : { "t" : "kenneth.lay@enron.com",
"f" : "40enron@enron.com" }, "count" : 2 }
{ "_id" : { "t" : "kenneth.lay@enron.com",
"f" : "a..davis@enron.com" }, "count" : 2 }
{ "_id" : { "t" : "kenneth.lay@enron.com",
"f" : "a..hughes@enron.com" }, "count" : 4 }
{ "_id" : { "t" : "kenneth.lay@enron.com",
"f" : "a..lindholm@enron.com" }, "count" : 1 }
{ "_id" : { "t" : "kenneth.lay@enron.com",
"f" : "a..schroeder@enron.com" }, "count" : 1 }
Query Data
Hadoop Streaming
Same thing but lets use Python
PYTHON FTW!!!
Hadoop Streaming: How it works
> pip install pymongo_hadoop
Install python library
from pymongo_hadoop import BSONMapper
def mapper(documents): i=0
for doc in documents: i=i+1
from_field = doc['headers']['From']
to_field = doc['headers']['To']
recips = [x.strip() for x in to_field.split(',')]
for r in recips:
yield {'_id': {'f':from_field, 't':r}, 'count': 1}
BSONMapper(mapper)
print >> sys.stderr, "Done Mapping."
Map Phase
from pymongo_hadoop import BSONReducer
def reducer(key, values):
print >> sys.stderr, "Processing from/to %s" % str(key)
_count = 0
for v in values:
_count += v['count']
return {'_id': key, 'count': _count}
BSONReducer(reducer)
Reduce Phase
Surviving Hadoop
MapReduce easier w/ PIG and Hive
•  PIG
–  Powerful language
–  Generates sophisticated map/reduce
–  Workflows from simple scripts
•  HIVE
–  Similar to PIG
–  SQL as language
data = LOAD 'mongodb://localhost:27017/db.collection'
using com.mongodb.hadoop.pig.MongoLoader;
STORE records INTO 'file:///output.bson'
using com.mongodb.hadoop.pig.BSONStorage;
MongoDB+Hadoop and PIG
MongoDB + Hadoop and PIG
•  PIG has a some special datatypes
–  Bags
–  Maps
–  Tuples
•  MongoDB+Hadoop Connector converts between PIG
and MongoDB datatypes
raw = LOAD 'hdfs:///messages.bson'
using com.mongodb.hadoop.pig.BSONLoader('','headers:[]') ;
send_recip = FOREACH raw GENERATE $0#'From' as from, $0#'To'
as to;
//filter && split
send_recip_filtered = FILTER send_recip BY to IS NOT NULL;
send_recip_split = FOREACH send_recip_filtered GENERATE
from as from, TRIM(FLATTEN(TOKENIZE(to))) as to;
//group && count
send_recip_grouped = GROUP send_recip_split BY (from, to);
send_recip_counted = FOREACH send_recip_grouped GENERATE
group, COUNT($1) as count;
STORE send_recip_counted INTO 'file:///enron_results.bson' using
com.mongodb.hadoop.pig.BSONStorage;
PIG
Upcoming
Roadmap Features
•  Performance Improvements–Lazy BSON
•  Full-Featured Hive Support
•  Support multi-collection input
•  API for custom splitter implementations
•  And lots more …
Recap
•  Use Hadoop for massive MapReduce computations on
big data sets stored in MongoDB
•  MongoDB can be used as Hadoop filesystem
•  There’s lots of tools to make it easier
–  Streaming
–  Hive
–  PIG
–  EMR
•  https://github.com/mongodb/mongo-hadoop/tree/
master/examples
Barcelona MUG MongoDB + Hadoop Presentation
QA?
MongoDB + Hadoop
Senior Solutions Architect, MongoDB Inc.
Norberto Leite
#bcnmug
Ad

More Related Content

What's hot (20)

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Nosh Petigara
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
MongoDB
 
Introduction to df
Introduction to dfIntroduction to df
Introduction to df
Mohit Jaggi
 
MongoDB and Python
MongoDB and PythonMongoDB and Python
MongoDB and Python
Norberto Leite
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
antoinegirbal
 
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim TkachenkoSupercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Altinity Ltd
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
MongoDB
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
MongoDB
 
The Very ^ 2 Basics of R
The Very ^ 2 Basics of RThe Very ^ 2 Basics of R
The Very ^ 2 Basics of R
Winston Chen
 
Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!
Nathan Bijnens
 
R statistics with mongo db
R statistics with mongo dbR statistics with mongo db
R statistics with mongo db
MongoDB
 
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
MongoDB
 
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
MongoDB
 
Python and MongoDB
Python and MongoDB Python and MongoDB
Python and MongoDB
Norberto Leite
 
Data science at the command line
Data science at the command lineData science at the command line
Data science at the command line
Sharat Chikkerur
 
Webinar: MongoDB + Hadoop
Webinar: MongoDB + HadoopWebinar: MongoDB + Hadoop
Webinar: MongoDB + Hadoop
MongoDB
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20
Jelena Zanko
 
Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework
MongoDB
 
Introduction to MongoDB and Hadoop
Introduction to MongoDB and HadoopIntroduction to MongoDB and Hadoop
Introduction to MongoDB and Hadoop
Steven Francia
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
Nosh Petigara
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
MongoDB
 
Introduction to df
Introduction to dfIntroduction to df
Introduction to df
Mohit Jaggi
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
antoinegirbal
 
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim TkachenkoSupercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Altinity Ltd
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
MongoDB
 
Using MongoDB + Hadoop Together
Using MongoDB + Hadoop TogetherUsing MongoDB + Hadoop Together
Using MongoDB + Hadoop Together
MongoDB
 
The Very ^ 2 Basics of R
The Very ^ 2 Basics of RThe Very ^ 2 Basics of R
The Very ^ 2 Basics of R
Winston Chen
 
Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!
Nathan Bijnens
 
R statistics with mongo db
R statistics with mongo dbR statistics with mongo db
R statistics with mongo db
MongoDB
 
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to...
MongoDB
 
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
Big Data Analytics 3: Machine Learning to Engage the Customer, with Apache Sp...
MongoDB
 
Data science at the command line
Data science at the command lineData science at the command line
Data science at the command line
Sharat Chikkerur
 
Webinar: MongoDB + Hadoop
Webinar: MongoDB + HadoopWebinar: MongoDB + Hadoop
Webinar: MongoDB + Hadoop
MongoDB
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20
Jelena Zanko
 
Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework Beyond the Basics 2: Aggregation Framework
Beyond the Basics 2: Aggregation Framework
MongoDB
 
Introduction to MongoDB and Hadoop
Introduction to MongoDB and HadoopIntroduction to MongoDB and Hadoop
Introduction to MongoDB and Hadoop
Steven Francia
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
MongoDB
 

Similar to Barcelona MUG MongoDB + Hadoop Presentation (20)

Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014
MongoDB
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
Paul Chao
 
2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo
Michael Bright
 
Using MongoDB and Python
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and Python
Mike Bright
 
hadoop
hadoophadoop
hadoop
longhao
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
Peter Friese
 
Couchbas for dummies
Couchbas for dummiesCouchbas for dummies
Couchbas for dummies
Qureshi Tehmina
 
SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...
SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...
SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...
Inhacking
 
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Аліна Шепшелей
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
christkv
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
Ran Silberman
 
Lecture 04 big data analytics | map reduce
Lecture 04 big data analytics | map reduceLecture 04 big data analytics | map reduce
Lecture 04 big data analytics | map reduce
anasbro009
 
מיכאל
מיכאלמיכאל
מיכאל
sqlserver.co.il
 
MAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxMAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptx
HARIKRISHNANU13
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
Prashant Gupta
 
Hadoop
HadoopHadoop
Hadoop
devakalyan143
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Skills Matter
 
Behm Shah Pagerank
Behm Shah PagerankBehm Shah Pagerank
Behm Shah Pagerank
gothicane
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
Ran Silberman
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDB
MongoDB
 
Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014Hadoop - MongoDB Webinar June 2014
Hadoop - MongoDB Webinar June 2014
MongoDB
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
Paul Chao
 
2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo2016 feb-23 pyugre-py_mongo
2016 feb-23 pyugre-py_mongo
Michael Bright
 
Using MongoDB and Python
Using MongoDB and PythonUsing MongoDB and Python
Using MongoDB and Python
Mike Bright
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
Peter Friese
 
SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...
SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...
SE2016 BigData Vitalii Bondarenko "HD insight spark. Advanced in-memory Big D...
Inhacking
 
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics wi...
Аліна Шепшелей
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
christkv
 
Lecture 04 big data analytics | map reduce
Lecture 04 big data analytics | map reduceLecture 04 big data analytics | map reduce
Lecture 04 big data analytics | map reduce
anasbro009
 
MAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxMAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptx
HARIKRISHNANU13
 
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UKIntroduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Introduction to Sqoop Aaron Kimball Cloudera Hadoop User Group UK
Skills Matter
 
Behm Shah Pagerank
Behm Shah PagerankBehm Shah Pagerank
Behm Shah Pagerank
gothicane
 
Dev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDBDev Jumpstart: Build Your First App with MongoDB
Dev Jumpstart: Build Your First App with MongoDB
MongoDB
 
Ad

More from Norberto Leite (20)

Data Modelling for MongoDB - MongoDB.local Tel Aviv
Data Modelling for MongoDB - MongoDB.local Tel AvivData Modelling for MongoDB - MongoDB.local Tel Aviv
Data Modelling for MongoDB - MongoDB.local Tel Aviv
Norberto Leite
 
Avoid Query Pitfalls
Avoid Query PitfallsAvoid Query Pitfalls
Avoid Query Pitfalls
Norberto Leite
 
MongoDB and Spark
MongoDB and SparkMongoDB and Spark
MongoDB and Spark
Norberto Leite
 
Mongo db 3.4 Overview
Mongo db 3.4 OverviewMongo db 3.4 Overview
Mongo db 3.4 Overview
Norberto Leite
 
MongoDB Certification Study Group - May 2016
MongoDB Certification Study Group - May 2016MongoDB Certification Study Group - May 2016
MongoDB Certification Study Group - May 2016
Norberto Leite
 
Geospatial and MongoDB
Geospatial and MongoDBGeospatial and MongoDB
Geospatial and MongoDB
Norberto Leite
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
Norberto Leite
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger Internals
Norberto Leite
 
MongoDB 3.2 Feature Preview
MongoDB 3.2 Feature PreviewMongoDB 3.2 Feature Preview
MongoDB 3.2 Feature Preview
Norberto Leite
 
Mongodb Spring
Mongodb SpringMongodb Spring
Mongodb Spring
Norberto Leite
 
MongoDB on Azure
MongoDB on AzureMongoDB on Azure
MongoDB on Azure
Norberto Leite
 
MongoDB: Agile Combustion Engine
MongoDB: Agile Combustion EngineMongoDB: Agile Combustion Engine
MongoDB: Agile Combustion Engine
Norberto Leite
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
Norberto Leite
 
Spark and MongoDB
Spark and MongoDBSpark and MongoDB
Spark and MongoDB
Norberto Leite
 
Analyse Yourself
Analyse YourselfAnalyse Yourself
Analyse Yourself
Norberto Leite
 
Strongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible SchemasStrongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible Schemas
Norberto Leite
 
Effectively Deploying MongoDB on AEM
Effectively Deploying MongoDB on AEMEffectively Deploying MongoDB on AEM
Effectively Deploying MongoDB on AEM
Norberto Leite
 
Advanced applications with MongoDB
Advanced applications with MongoDBAdvanced applications with MongoDB
Advanced applications with MongoDB
Norberto Leite
 
MongoDB and Node.js
MongoDB and Node.jsMongoDB and Node.js
MongoDB and Node.js
Norberto Leite
 
MongoDB + Spring
MongoDB + SpringMongoDB + Spring
MongoDB + Spring
Norberto Leite
 
Data Modelling for MongoDB - MongoDB.local Tel Aviv
Data Modelling for MongoDB - MongoDB.local Tel AvivData Modelling for MongoDB - MongoDB.local Tel Aviv
Data Modelling for MongoDB - MongoDB.local Tel Aviv
Norberto Leite
 
MongoDB Certification Study Group - May 2016
MongoDB Certification Study Group - May 2016MongoDB Certification Study Group - May 2016
MongoDB Certification Study Group - May 2016
Norberto Leite
 
Geospatial and MongoDB
Geospatial and MongoDBGeospatial and MongoDB
Geospatial and MongoDB
Norberto Leite
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger Internals
Norberto Leite
 
MongoDB 3.2 Feature Preview
MongoDB 3.2 Feature PreviewMongoDB 3.2 Feature Preview
MongoDB 3.2 Feature Preview
Norberto Leite
 
MongoDB: Agile Combustion Engine
MongoDB: Agile Combustion EngineMongoDB: Agile Combustion Engine
MongoDB: Agile Combustion Engine
Norberto Leite
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
Norberto Leite
 
Strongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible SchemasStrongly Typed Languages and Flexible Schemas
Strongly Typed Languages and Flexible Schemas
Norberto Leite
 
Effectively Deploying MongoDB on AEM
Effectively Deploying MongoDB on AEMEffectively Deploying MongoDB on AEM
Effectively Deploying MongoDB on AEM
Norberto Leite
 
Advanced applications with MongoDB
Advanced applications with MongoDBAdvanced applications with MongoDB
Advanced applications with MongoDB
Norberto Leite
 
Ad

Recently uploaded (20)

HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 

Barcelona MUG MongoDB + Hadoop Presentation