Database Choices
@LynnLangit
May 2014 – Techorama
Databases Now -> a Menu of Choices
Why Change? -> “Small” Big Data
Your data - BEHAVIORAL
Your data - TRANSACTIONAL
PUBLIC data
PREMIUM data
Current Data Questions
• “Should we evaluate Hadoop?”
• “How much data is Big Data?”
• “What are the limits of SQL Server?”
• “Which NoSQL databases (if any) should we consider?”
• “How safe is the cloud really?”
• “How do we mine the data for usable information?”
DEMO - About Open Source
• Free
– Rapid iteration, innovation
– Can start up for free (on premise)
– Can ‘rent’ for cheap or free on the cloud
– Can use with the command line for free
– Some vendors offer free online training
– Ex. www.neo4j.org
• Not Free
– Constant releases
– Can be deceptively hard to set up (time is money)
– Don’t forget to turn it off if on the cloud!
– GUI tools, support, training cost $$$
– Ex. www.neo4j.com
Database Choices – The first level of choice
Data
A. Hadoop
B. NoSQL
C. Relational
On Premise or In the Cloud
Working with Hadoop
About Hadoop MapReduce
HDFS
How you ‘get’ Hadoop
A. Open source
• Roll your own
B. Commercial distribution
• Cloudera
• MapR
• Hortonworks
• More…
C. Rent it via the cloud
• AWS
• HDInsight
Demo - Cloudera Hadoop Enterprise
Database Choices
Demo – AWS MapReduce
Example Comparison: RDBMS vs. Hadoop
Attribute | Traditional RDBMS | Hadoop / MapReduce
Data Size | Gigabytes (Terabytes) | Petabytes and greater
Access | Interactive and Batch | Batch – NOT Interactive
Updates | Read / Write many times | Write once, Read many times
Structure | Static Schema | Dynamic Schema
Integrity | High (ACID) | Low
Scaling | Nonlinear | Linear
Query Response Time | Can be near immediate | Has latency (due to batch processing)
Database Choices
On Premise
• RDBMS
• NoSQL
• Hadoop
In Cloud
• RDBMS
• NoSQL
• Hadoop
An Aside…SQL Server 2012++ ‘NoSQL’
• SQL Server 2012 Columnstore Index
• SQL Server 2012 Tabular Model (SSAS)
Feature | 2012 | 2014
SSAS Tabular Models | X | X
NC Columnstore Index | X | X
Clustered (writable) Columnstore Index | - | X
In-memory OLTP | - | X
But wait…
is there a
RELATIONAL database
that scales,
that is cheap,
that runs in the cloud?
DEMO - AWS Redshift
• About $1k per Terabyte per year - relational
So many NoSQL options
• More than just the Elephant in the room
• 150+ types of NoSQL databases
Flavors of NoSQL
• Key/Value (volatile)
• Key/Value (persistent)
• Wide-Column
• Document
• Graph
Key / Value Database
• Just keys and values
– No schema
• Persistent or Volatile
• Examples
– AWS DynamoDB
– Riak
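The "just keys and values, no schema" idea can be sketched with a plain Python dictionary standing in for a real store. This is only an illustration: `KeyValueStore` and its method names are hypothetical and are not the API of DynamoDB, Riak, or any other product.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory (volatile) key/value store. Hypothetical, for illustration."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # No schema: any value can be stored under any key.
        self._items[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Lookup is by key only; the store never inspects the value.
        return self._items.get(key)

    def delete(self, key: str) -> None:
        # Deleting a missing key is a no-op.
        self._items.pop(key, None)


store = KeyValueStore()
store.put("cart:42", {"items": ["book", "pen"]})  # made-up sample data
store.put("visits:home", 1017)
print(store.get("cart:42"))
print(store.get("missing"))
```

A persistent store would write the same key/value pairs to disk or to a replicated cluster; the access pattern (put, get, delete by key) stays the same.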
DEMO - AWS DynamoDB
• Key/Value store on the AWS cloud
File (BLOB) Storage Buckets in the Cloud
• Amazon – S3 or Glacier
• Google – Cloud Storage
• Microsoft – Azure Blobs
DEMO - Battle of the Buckets
• Google Cloud Storage vs.
• Windows Azure Blobs vs.
• AWS S3 (archiving into AWS Glacier)
Column Database
• Wide, sparse column sets
• Schema-light
• Examples:
– HBase w/Hadoop
– Google Cloud Datastore
– SQL Server Columnstore Indexes or SSAS Tabular Models
Types of Column Databases
• Column-families
– Non-relational
– Sparse
– Examples:
• HBase
• Cassandra
• xVelocity (SQL 2012 Tabular)
• Column-stores
– Relational
– Dense
– Example:
• SQL Server 2012 Columnstore index
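The difference between a dense column-store and a sparse column-family layout can be sketched with plain Python structures. The sample rows, keys, and column names below are made up for illustration; real engines add compression, indexing, and on-disk formats that this sketch ignores.

```python
# Made-up sample rows, as a row-oriented table would hold them.
rows = [
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 80},
    {"id": 3, "region": "EU", "amount": 50},
]

# Column-store (dense): one array per column, every row has every column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one column reads only that column's array.
total_amount = sum(columns["amount"])

# Column-family (sparse): each row key holds only the columns it actually has.
sparse = {
    "user:1": {"name": "Ann", "email": "ann@example.com"},
    "user:2": {"name": "Bob", "last_login": "2014-05-01"},
}
all_columns = sorted({col for cols in sparse.values() for col in cols})

print(total_amount)
print(all_columns)
```

The dense layout suits analytic scans over a fixed schema, while the sparse layout suits wide rows where most columns are absent for any given key.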
DEMO – Google Cloud Datastore
DEMO – SQL Server ‘NoSQL’
• SQL Server Columnstore Index
• SQL Server SSAS Tabular Model
Document Database
• Document-oriented (collection of JSON documents) w/semi-structured data
– Encodings include BSON, JSON, XML…
• Binary forms
– PDF, Microsoft Office documents (Word, Excel…)
• Examples:
– MongoDB
– Couchbase
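A collection of semi-structured JSON documents can be sketched with the Python standard library. The documents and the `find` helper are hypothetical and only illustrate the idea; this is not the MongoDB or Couchbase query API.

```python
import json

# Two made-up documents in one collection; note they do not share a schema.
collection = [
    json.loads('{"_id": 1, "name": "Ann", "tags": ["admin"]}'),
    json.loads('{"_id": 2, "name": "Bob", "address": {"city": "Ghent"}}'),
]


def find(docs, field, value):
    """Return documents whose top-level field equals value (hypothetical helper)."""
    # doc.get() tolerates documents that lack the field entirely.
    return [doc for doc in docs if doc.get(field) == value]


matches = find(collection, "name", "Bob")
print(json.dumps(matches[0], sort_keys=True))
```

Each document carries its own structure, so adding a field to one document requires no change to the others.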
Demo - MongoDB
Graph Databases
• A lot of many-to-many relationships
• Recursive self-joins
• When your primary objective is quickly finding connections, patterns and relationships between the objects within lots of data
• Examples:
– Neo4j
– AlgebraixData
– Google Freebase
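Finding connections between objects, the kind of question that needs recursive self-joins in SQL, can be sketched as a breadth-first search over an adjacency list. The people and relationships below are made up, and `shortest_path` is a hypothetical helper; graph databases such as Neo4j expose this through their own query languages instead.

```python
from collections import deque

# Made-up "knows" relationships, stored as an adjacency list.
graph = {
    "Ann": ["Bob", "Cid"],
    "Bob": ["Ann", "Dee"],
    "Cid": ["Ann"],
    "Dee": ["Bob", "Eve"],
    "Eve": ["Dee"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of connections or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no connection exists


print(shortest_path(graph, "Ann", "Eve"))
```

In a relational schema the same question would need one self-join per hop, with the number of hops unknown in advance.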
DEMO – Neo4J
Cloud-hosted, partially managed RDBMS
• AWS RDS
– SQL Server
– MySQL
– PostgreSQL
– Oracle
• Google
– MySQL
• Microsoft
– SQLAzure
DEMO - AWS RDS
• SQL Server, MySQL or Oracle
• Essential to understand pricing models
NoSQL Applied
Log Files
• Columnstore
• HBase
Product Catalogs
• Key/Value
• DynamoDB
Social Games
• Document
• MongoDB
Social Aggregators
• Graph
• Neo4j
Line-of-Business
• RDBMS
• SQL Server
Cloud Offerings– RDBMS AND NoSQL
Category | AWS | Google | Microsoft
Managed RDBMS | RDS – all major RDBMS | Cloud SQL | SQL Azure
NoSQL buckets | S3 or Glacier | Cloud Storage | Azure Blobs
NoSQL Key-Value | DynamoDB | Cloud Datastore | Azure Tables
Streaming or ML | Kinesis | Prospective Search & Prediction API | StreamInsight
NoSQL Document or Graph | MongoDB on EC2; Neo4j on EC2 | None; Freebase | MongoDB on Microsoft Cloud; Neo4j on Microsoft Cloud
Hadoop (HBase) | Elastic MapReduce (S3 & EC2) | None | HDInsight
Dremel/Warehousing | RedShift | BigQuery | None
Cloud ETL | Data Pipelines | None | None
But wait…
how do I query
NoSQL data?
Example – translate ANSI SQL to MapReduce
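One way to picture the translation is to express a simple aggregate query as map, shuffle, and reduce steps. The sketch below runs in plain Python on one machine and is not Hadoop code; the `sales` table, its columns, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# SQL being translated (hypothetical table and columns):
#   SELECT region, SUM(amount) FROM sales GROUP BY region;
sales = [("EU", 120), ("US", 80), ("EU", 50)]  # made-up rows


def map_phase(record):
    """Map: emit (key, value) pairs; the GROUP BY column becomes the key."""
    region, amount = record
    yield region, amount


def shuffle(pairs):
    """Shuffle: collect all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: apply the aggregate (here SUM) to each group."""
    return key, sum(values)


mapped = [pair for record in sales for pair in map_phase(record)]
result = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(result)
```

On a cluster the map and reduce functions run in parallel on different nodes and the framework performs the shuffle, which is where the batch latency comes from.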
Can Excel help?
• Connector to Hadoop
• Power BI
• Data Quality Services
• Master Data Services
• Integration with Azure Data Market
• Data Mining w/Predixion
Demo – Excel Power Query
NoSQL To-Do List
Understand types of NoSQL databases
• Use NoSQL when business needs dictate
• Use the right type of NoSQL for your business problem
Try out NoSQL on the cloud
• Quick and cheap for behavioral data
• Mashup cloud datasets
• Good for specialized use cases, e.g. dev, test, training environments
Learn NoSQL access technologies & services
• New query languages, e.g. MapReduce, R, Infer.NET
• New query tools (vendor-specific): Google Refine, Amazon Karmasphere, Microsoft Excel connectors, etc.
• Windows Azure Data Market, other public data markets
www.TeachingKidsProgramming.org
• Free Courseware (Java, Small Basic or C# [on Pluralsight])
• Do a Recipe -> Teach a Kid (Ages 10+)
• recipes)
A Big Thank You To Our Sponsors
Gold Partners
Silver & Track Partners
Platinum Partners
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 

Database Choices

  • 2. Databases Now -> a Menu of Choices
  • 3. Why Change? -> “Small” Big Data • Your data - BEHAVIORAL • Your data - TRANSACTIONAL • PUBLIC data • PREMIUM data
  • 4. Current Data Questions • “Should we evaluate Hadoop?” • “How much data is Big Data?” • “What are the limits of SQL Server?” • “Which NoSQL databases (if any) should we consider?” • “How safe is the cloud really?” • “How do we mine the data for usable information?”
  • 6. DEMO - About Open Source • Free – Rapid iteration, innovation – Can start up for free (on premise) – Can ‘rent’ for cheap or free on the cloud – Can use with the command line for free – Some vendors offer free online training – Ex. www.neo4j.org • Not Free – Constant releases – Can be deceptively hard to set up (time is money) – Don’t forget to turn it off if on the cloud! – GUI tools, support, training cost $$$ – Ex. www.neo4j.com
  • 7. Database Choices – The first level of choice Data A. Hadoop B. NoSQL C. Relational On Premise or In the Cloud
  • 10. How you ‘get’ Hadoop • A. Open source – roll your own • B. Commercial distribution – Cloudera – MapR – Hortonworks – More… • C. Rent it via the cloud – AWS – HDInsight
  • 11. Demo - Cloudera Hadoop Enterprise
  • 13. Demo – AWS MapReduce
  • 14. Example Comparison: RDBMS vs. Hadoop
      Attribute | Traditional RDBMS | Hadoop / MapReduce
      Data Size | Gigabytes (Terabytes) | Petabytes and greater
      Access | Interactive and Batch | Batch – NOT Interactive
      Updates | Read / Write many times | Write once, Read many times
      Structure | Static Schema | Dynamic Schema
      Integrity | High (ACID) | Low
      Scaling | Nonlinear | Linear
      Query Response Time | Can be near immediate | Has latency (due to batch processing)
  • 15. Database Choices • On Premise – RDBMS – NoSQL – Hadoop • In Cloud – RDBMS – NoSQL – Hadoop
  • 16. An Aside…SQL Server 2012++ ‘NoSQL’ • SQL Server 2012 Columnstore Index • SQL Server 2012 Tabular Model (SSAS)
      Feature | 2012 | 2014
      SSAS Tabular Models | X | X
      NC Columnstore Index | X | X
      Clustered (writable) Columnstore Index | - | X
      In-memory OLTP | - | X
  • 17. But wait… is there a RELATIONAL database that scales, that is cheap, that runs in the cloud?
  • 18. DEMO - AWS Redshift • About $1k per Terabyte per year - relational
  • 19. So many NoSQL options • More than just the Elephant in the room • More than 150 types of NoSQL databases
  • 21. Key / Value Database • Just keys and values – No schema • Persistent or Volatile • Examples – AWS DynamoDB – Riak
  • 22. DEMO - AWS DynamoDB • Key/Value store on the AWS cloud
  • 23. File (BLOB) Storage Buckets in the Cloud • Amazon – S3 or Glacier • Google – Cloud Storage • Microsoft Azure BLOBS
  • 24. DEMO - Battle of the Buckets • Google Cloud Storage VS. • Windows Azure BLOBS VS. • AWS S3 -> (archiving) into AWS Glacier
  • 25. Column Database • Wide, sparse column sets • Schema-light • Examples: – HBase w/Hadoop – Google Cloud Datastore – SQL Server Columnstore Indexes or SSAS Tabular Models
  • 26. Types of Column Databases • Column-families – Non-relational – Sparse – Examples: HBase, Cassandra, xVelocity (SQL 2012 Tabular) • Column-stores – Relational – Dense – Example: SQL Server 2012 Columnstore index
  • 27. DEMO – Google Cloud Datastore
  • 28. DEMO – SQL Server ‘NoSQL’ • SQL Server Columnstore Index • SQL Server SSAS Tabular Model
  • 29. Document Database • Document-oriented (collection of JSON documents) with semi-structured data – Encodings include BSON, JSON, XML… • Binary forms – PDF, Microsoft Office documents (Word, Excel…) • Examples: – MongoDB – Couchbase
  • 31. Graph Databases • a lot of many-to-many relationships • recursive self-joins • when your primary objective is quickly finding connections, patterns and relationships between the objects within lots of data • Examples: – Neo4j – AlgebraixData – Google Freebase
  • 33. Cloud-hosted, partially managed RDBMS • AWS RDS – SQL Server – MySQL – PostgreSQL – Oracle • Google – MySQL • Microsoft – SQL Azure
  • 34. DEMO - AWS RDS • SQL Server, MySQL or Oracle • Essential to understand pricing models
  • 35. NoSQL Applied • Log Files – Columnstore – HBase • Product Catalogs – Key/Value – DynamoDB • Social Games – Document – MongoDB • Social aggregators – Graph – Neo4j • Line-of-Business – RDBMS – SQL Server
  • 36. Cloud Offerings – RDBMS AND NoSQL
      Category | AWS | Google | Microsoft
      Managed RDBMS | RDS – all major RDBMS | Cloud SQL | SQL Azure
      NoSQL buckets | S3 or Glacier | Cloud Storage | Azure Blobs
      NoSQL Key-Value | DynamoDB | Cloud Datastore | Azure Tables
      Streaming or ML | Kinesis | Prospective Search & Prediction API | StreamInsight
      NoSQL Document or Graph | MongoDB on EC2; Neo4j on EC2 | None; Freebase | MongoDB on Microsoft Cloud; Neo4j on Microsoft Cloud
      Hadoop (HBase) | Elastic MapReduce (S3 & EC2) | None | HDInsight
      Dremel/Warehousing | RedShift | BigQuery | None
      Cloud ETL | Data Pipelines | None | None
  • 37. But wait… how do I query NoSQL data?
  • 38. Example – translate ANSI SQL to MapReduce
  • 39. Can Excel help? • Connector to Hadoop • Power BI • Data Quality Services • Master Data Services • Integration with Azure Data Market • Data Mining w/Predixion
  • 40. Demo – Excel Power Query
  • 41. NoSQL To-Do List • Understand types of NoSQL databases – Use NoSQL when business needs dictate – Use the right type of NoSQL for your business problem • Try out NoSQL on the cloud – Quick and cheap for behavioral data – Mash up cloud datasets – Good for specialized use cases, e.g. dev, test, training environments • Learn NoSQL access technologies & services – New query languages, e.g. MapReduce, R, Infer.NET – New query tools (vendor-specific): Google Refine, Amazon Karmasphere, Microsoft Excel connectors, etc. – Windows Azure Data Market, other public data markets
  • 42. www.TeachingKidsProgramming.org • Free Courseware (Java, Small Basic or C# [on Pluralsight]) • Do a Recipe -> Teach a Kid (Ages 10++)
  • 43. A Big Thank You To Our Sponsors • Platinum Partners • Gold Partners • Silver & Track Partners
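
Slide 38 (translating ANSI SQL to MapReduce) can be illustrated with a small in-memory sketch. This is not Hadoop code and not the example shown in the talk; it only mimics the map, shuffle and reduce steps in plain Python for a query along the lines of SELECT customer, SUM(amount) FROM orders WHERE amount > 10 GROUP BY customer. The ORDERS rows, the column names and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for an "orders" table.
ORDERS = [
    {"customer": "alice", "amount": 30.0},
    {"customer": "bob", "amount": 5.0},
    {"customer": "alice", "amount": 20.0},
    {"customer": "bob", "amount": 15.0},
    {"customer": "carol", "amount": 8.0},
]


def order_mapper(row):
    """Map step: the WHERE clause becomes a filter, and the
    GROUP BY column becomes the emitted key."""
    if row["amount"] > 10:
        yield row["customer"], row["amount"]


def shuffle(pairs):
    """Shuffle step: collect every value emitted under the same key.
    A real framework does this across machines; here it is a dict."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def sum_reducer(key, values):
    """Reduce step: the SUM() aggregate, applied once per key."""
    return sum(values)


def total_by_customer(rows):
    """Run map -> shuffle -> reduce over the rows and return
    a dict of customer -> summed amount."""
    mapped = (pair for row in rows for pair in order_mapper(row))
    grouped = shuffle(mapped)
    return {key: sum_reducer(key, values) for key, values in grouped.items()}


if __name__ == "__main__":
    print(total_by_customer(ORDERS))
```

The mapping is roughly: WHERE becomes a filter in the mapper, GROUP BY picks the key, and the aggregate becomes the reducer. Customers whose rows are all filtered out (carol here) never reach the reduce step, just as they would not appear in the SQL result.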

Editor's Notes

  • #3: http://pragprog.com/book/rwdata/seven-databases-in-seven-weeks
  • #9: http://hortonworks.com/technology/hortonworksdataplatform/ More about HBase, from the O’Reilly ‘Getting Ready for BigData’ report “Enter HBase, a column-oriented database that runs on top of HDFS. Modeled after Google’s BigTable, the project’s goal is to host billions of rows of data for rapid access. MapReduce can use HBase as both a source and a destination for its computations, and Hive and Pig can be used in combination with HBase. In order to grant random access to the data, HBase does impose a few restrictions: performance with Hive is 4-5 times slower than plain HDFS, and the maximum amount of data you can store is approximately a petabyte, versus HDFS’ limit of over 30PB.” http://www.cloudera.com/
  • #10: http://hortonworks.com/technology/hortonworksdataplatform/ More about HBase, from the O’Reilly ‘Getting Ready for BigData’ report “Enter HBase, a column-oriented database that runs on top of HDFS. Modeled after Google’s BigTable, the project’s goal is to host billions of rows of data for rapid access. MapReduce can use HBase as both a source and a destination for its computations, and Hive and Pig can be used in combination with HBase. In order to grant random access to the data, HBase does impose a few restrictions: performance with Hive is 4-5 times slower than plain HDFS, and the maximum amount of data you can store is approximately a petabyte, versus HDFS’ limit of over 30PB.” http://www.cloudera.com/
  • #12: http://www.cloudera.com/content/cloudera/en/products-and-services/cloudera-live.html
  • #13: http://www.cloudera.com/content/cloudera-content/cloudera-docs/DemoVMs/Cloudera-QuickStart-VM/cloudera_quickstart_vm.html
  • #15: Original Reference: Tom White’s Hadoop: The Definitive Guide (I made some modifications based on my experience)
  • #20: http://nosql-database.org/ http://hadoop.apache.org/ & http://www.mongodb.org/ Wikipedia - http://en.wikipedia.org/wiki/NoSQL List of NoSQL databases – http://nosql-database.org/ The good, the bad - http://www.techrepublic.com/blog/10things/10-things-you-should-know-about-nosql-databases/1772
  • #21: http://bigdatanerd.wordpress.com/2012/01/04/why-nosql-part-2-overview-of-data-modelrelational-nosql/ http://docs.jboss.org/hibernate/ogm/3.0/reference/en-US/html_single/
  • #22: http://en.wikipedia.org/wiki/Project_Voldemort http://aws.amazon.com/ http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/Introduction.html http://www.allthingsdistributed.com/2012/01/amazon-dynamodb.html
  • #24: http://code.google.com Access is via REST APIs. Very cheap, but not much functionality is included, so there is a lot of code to write for application development. But it can be a good backup solution.
  • #26: http://googledevelopers.blogspot.com/2014/01/get-started-with-google-cloud-platform.html http://stage.hypertable.com/index.php/documentation/architecture/ http://code.google.com/appengine/ http://code.google.com/appengine/articles/datastore/overview.html
  • #27: http://cwebbbi.wordpress.com/2012/02/14/so-what-is-the-bi-semantic-model/ http://www.databasejournal.com/features/mssql/understanding-new-column-store-index-of-sql-server-2012.html http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html http://ayende.com/blog/4500/that-no-sql-thing-column-family-databases
  • #28: https://developers.google.com/datastore/docs/concepts/overview http://googledevelopers.blogspot.com/2014/01/get-started-with-google-cloud-platform.html
  • #30: http://en.wikipedia.org/wiki/MongoDB http://www.mongodb.org/downloads http://www.mongodb.org/display/DOCS/Drivers
  • #31: http://en.wikipedia.org/wiki/MongoDB & http://try.mongodb.org/ http://www.mongodb.org/downloads http://www.mongodb.org/display/DOCS/Drivers
  • #32: http://www.infinitegraph.com/what-is-a-graph-database.html and http://www.neo4j.org/ http://en.wikipedia.org/wiki/Graph_database http://www.freebase.com/
  • #33: http://www.neo4j.org/learn/try
  • #34: For Google - http://code.google.com For AWS - https://console.aws.amazon.com/console/home
  • #37: Hadoop on AWS - http://wiki.apache.org/hadoop/AmazonEC2
  • #39: http://rickosborne.org/download/SQL-to-MongoDB.pdf
  • #41: http://www.microsoft.com/en-us/bi/default.aspx http://dennyglee.com/ Demos - http://www.youtube.com/watch?v=djfpPsGwm6A and http://www.youtube.com/watch?v=uh9bKWO1K7U
  • #43: Lynn