MongoDB and Azure Databricks


The document outlines the main components of Apache Spark, including the core APIs (RDDs, DataFrames, and Datasets) and the libraries for machine learning and structured streaming. It highlights how Azure Databricks runs Spark as a managed platform, with an optimized runtime and productivity features for data processing and warehousing. It also covers the environment's support for collaborative workspaces, job scheduling, and BI tool integrations.


Apache Spark Core APIs
RDDs, DataFrames, Datasets
Spark SQL
GraphX / GraphFrames (graph)
Structured Streaming
MLlib (machine learning)
Spark: The Definitive Guide

Managed Apache Spark platform optimized for Azure
Microsoft Azure

Optimized Databricks Runtime Engine
DATABRICKS I/O
SERVERLESS
Collaborative Workspace
Cloud storage
Data warehouses
Hadoop storage
IoT / streaming data
REST APIs
Machine learning models
BI tools
Data exports
Data warehouses
AZURE DATABRICKS
Enhance Productivity
Deploy Production Jobs & Workflows
APACHE SPARK
MULTI-STAGE PIPELINES
DATA ENGINEER
JOB SCHEDULER
NOTIFICATION & LOGS
DATA SCIENTIST
BUSINESS ANALYST
Build on secure & trusted cloud
Scale without limits

[Diagram: a Spark Master and executors (Executor0 … Executor7), each running tasks. SparkConn connections link the executors to a replica set with one Primary and two Secondary members.]
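
The diagram above shows Spark tasks connecting to a replica set made up of a primary and two secondaries. As a rough sketch of how such a connection might be described, the Python below builds a MongoDB connection string and the options dict a Spark read could use. The option keys (`connection.uri`, `database`, `collection`) assume the 10.x naming of the MongoDB Connector for Apache Spark. The host names, replica set name, database, and collection are hypothetical. The `secondaryPreferred` read preference is an illustrative choice, since the slide does not say which member is read from.

```python
from urllib.parse import quote, urlencode


def mongo_read_options(hosts, database, collection,
                       replica_set=None,
                       read_preference="secondaryPreferred",
                       username=None, password=None):
    """Return an options dict for a Spark read via the MongoDB connector.

    The option keys follow the 10.x connector naming (an assumption here).
    """
    # Percent-encode credentials so characters such as '@' or ':' stay valid.
    credentials = ""
    if username is not None and password is not None:
        credentials = f"{quote(username, safe='')}:{quote(password, safe='')}@"

    # Standard MongoDB connection-string options.
    params = {"readPreference": read_preference}
    if replica_set:
        params["replicaSet"] = replica_set

    uri = f"mongodb://{credentials}{','.join(hosts)}/?{urlencode(params)}"
    return {"connection.uri": uri, "database": database, "collection": collection}


# Hypothetical hosts, replica set name, database, and collection.
options = mongo_read_options(
    ["host1:27017", "host2:27017", "host3:27017"],
    database="demo", collection="events", replica_set="rs0",
)
print(options["connection.uri"])

# On a cluster with the connector installed, the dict would be passed as:
#   df = spark.read.format("mongodb").options(**options).load()
```

Keeping the URI construction in a plain function means it can be checked without a Spark cluster. The read itself stays a comment because it needs the connector library and a reachable deployment.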

Official Apache Spark website
Azure Databricks Documentation
MongoDB Connector for Apache Spark
