SlideShare a Scribd company logo
Marco Parenzan
marco [dot] parenzan [at] 1nn0va [dot] it
www.marcoparenzan.com
Marco Parenzan
Solution Sales Specialist in Insight for Digital Innovation
Azure MVP
Community Lead 1nn0va // Pordenone
Linkedin: https://www.linkedin.com/in/marcoparenzan/
Road to be a newbie “Data *”
for a seasoned C# dev
in a Pythonic world
Data Processing in a Azure World...
• Thousands of IoT sensors in a factory,
producing petabytes of data
• Started with Stream Analytics...
• Like Azure Data Explorer...
• Data Warehouse...
• ...but there is a standard? Yes!
Apache Spark
The data c ompute exper ienc e
Spark Unifies:
 Batch Processing
 Interactive SQL
 Real-time processing
 Machine Learning
 Deep Learning
 Graph Processing
An unified, open source, parallel, data processing framework for Big Data Analytics
Spark Core Engine
Spark SQL
Batch processing
Spark Structured
Streaming
Stream processing
Spark MLlib
Machine
Learning
Yarn
Spark MLlib
Machine
Learning
Spark
Streaming
Stream processing
GraphX
Graph
Computation
http://spark.apache.org
Apache Spark
Data Sources (HDFS, SQL, NoSQL, …)
Cluster Manager
Node Node Node
Cache Cache Cache
Driver Program
SparkContext
General Spark Cluster Architecture
• ‘Driver’ runs the user’s ‘main’ function
and executes the various parallel
operations on the worker nodes.
• The results of the operations are
collected by the driver
• The worker nodes read and write data
from/to Data Sources including HDFS.
• Worker node also cache transformed
data in memory as RDDs (Resilient Data
Sets).
• Worker nodes and the Driver Node
execute as VMs in public clouds (AWS,
Google and Azure).
Elements of Spark
User’s
Program
Spark
Session
Task1 Task 2
Task 3 Task 4
Driver Executor
Cluster Manager
Read from
HDFS
Write to
HDFS
Read from
HDFS
Write to
HDFS
Read from
HDFS
What makes Spark fast
DataFrame as the core of Spark
• Recipe:
• Create Session
• Create Dataframe
• Define a user defined function
• Manipulate and view Data
CSV Data
JSON Data
RDBMS Data
Parquet Data
Binary Data
DataFrame
User programs
against the
DataFrame
abstraction
What about .NET?
• In a recent survey, more than 70% of .NET devs expressed interest in
Apache Spark
• Millions of lines of big data-usable business logic are written in .NET
• But .NET devs are locked out from big data processing – lack of .NET
support in OSS big data solutions
• We want a first-class .net experience in Spark
.NET for Apache Spark
• .NET bindings (C# e F#) to Spark
• Written on the Spark interop layer,
designed to provide high performance
bindings to multiple languages
• Re-use knowledge, skills, code you
have as a .NET developer
• Compliant with .NET Standard
• You can use .NET for Apache Spark
anywhere you write .NET code
• Original project Moebius
• https://github.com/microsoft/Mobius
.NET Spark support
Spark
DataFramews with
SparkSQL
• Spark 2.3.x,
2.4.x, 3.0
• ~300 SparkSQL
function
• DeltaLake
.NET Standard 2.0
• C#/F#
• .NET Framework
4.6.1+
• .NET Core 2.1+
Batch&Streaming
• Structured
Streaming
Data Science
• ML.NET
• Notebooks
Using .NET for Spark
• Get started with .NET for Apache
Spark | Microsoft Docs
• https://docs.microsoft.com/en-
us/dotnet/spark/tutorials/get-
started?tabs=windows
• Install .NET
• Install Java
• Install Apache Spark
• Install .NET for Apache Spark
• Create your app
• Install NuGet package
Batch vs. Notebooks
• Work on slow data stored into a
Datalake
• Submit a complete app in one
single deploy
• Receive the entire output
• Use «spart-submit»
• Run cell by cell (but also all at
once)
The .NET Notebook
experience
Evolution of REPL
• At the beginning there was mono
• Then Dynamic/DLR (C# 4)
• C#/F# interactive
• .NET Try
• In a world of:
• Python
• Mathematica
Jupyter
• Evolution and generalization of the seminal role of Mathematica (notebook)
• +Python adoption (ipynb)
• +Web (HTTP+Markdown)
• +Kernel
Demo
The docker
experience
Run Spark with .NET in a container
• Not an official Microsoft image
• https://hub.docker.com/r/3rdman/dotnet-spark
• Install nothing on your machine other than docker
• Launch and Debug your code from Visual Studio and Visual Studio Code
• Very good for development
Demo
The Azure Synapse
Analytics Experience
Engine for business-changing insights with seamless
ecosystem integration
Azure
Synapse Analytics
Data
integration
Data
warehousing
Big data
processing
Azure Data Lake Storage + Common Data Model
Azure Synapse Analytics
Limitless analytics service with unmatched time to insight
Synapse Analytics
Platform
Azure
Data Lake Storage
Common Data Model
Enterprise Security
Optimized for Analytics
Data lake integrated and
Common Data Model aware
METASTORE
SECURITY
MANAGEMENT
MONITORING
Integrated platform services
for, management, security,
monitoring, and meta-store
DATA INTEGRATION
Analytics Runtimes
Integrated analytics runtimes
available provisioned and
serverless on-demand
Synapse SQL offering T-SQL for
batch, streaming and interactive
processing
Apache Spark for big data
processing with Python, Scala and
.NET
PROVISIONED ON-DEMAND
Form Factors
SQL
Languages
Python .NET Java Scala R
Multiple languages suited to
different analytics workloads
Experience Synapse Analytics Studio
SaaS developer experiences for
code free and code first
Artificial Intelligence / Machine Learning / Internet of Things
Intelligent Apps / Business Intelligence
Designed for analytics workloads
at any scale
METASTORE
SECURITY
MANAGEMENT
MONITORING
Manage – Apache Spark pools
• Overview
• Provides ability to Pause, Scale, Assign Tags and upload packages from Studio.
.NET for Azure Synapse (and viceversa)
Spark ML
Algorithms
Spark ML Algorithms
Synapse Service
Job Service Frontend
Spark API
Controller …
Job Service Backend
Spark Plugin
Gateway
Resource
Provider
DB
Synapse Studio
AAD
Auth Service
Instance
Creation Service
DBDB
Azure
Spark Instance
VM VM VM VM VM
…
VM
Synapse Job Service
Develop Hub - Notebooks
• Notebooks
• Allows to write multiple
languages in one notebook
• %%<Name of language>
• Offers use of temporary tables
across languages
• Language support for Syntax
highlight, syntax error, syntax
code completion, smart indent,
code folding
• Export results
Demo
Conclusion
Add a footer 32
Add a footer 33
Conclusion
• Spark is here to stay
• Now it is a place for .NET skills too
• Azure Synapse Analytics the best fusion between the old and the new world
Ad

More Related Content

What's hot (20)

An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...
An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...
An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...
Databricks
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Spark Summit
 
Big data knolx
Big data knolxBig data knolx
Big data knolx
Knoldus Inc.
 
Shifting Data Science into High Gear
Shifting Data Science into High GearShifting Data Science into High Gear
Shifting Data Science into High Gear
Spark Summit
 
Uber's data science workbench
Uber's data science workbenchUber's data science workbench
Uber's data science workbench
Ran Wei
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured Streaming
Databricks
 
Spark at Airbnb
Spark at AirbnbSpark at Airbnb
Spark at Airbnb
Hao Wang
 
Getting Started with Spark Scala
Getting Started with Spark ScalaGetting Started with Spark Scala
Getting Started with Spark Scala
Knoldus Inc.
 
New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...
New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...
New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...
Databricks
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
Timothy Spann
 
Minería de Datos en Sql Server 2008
Minería de Datos en Sql Server 2008Minería de Datos en Sql Server 2008
Minería de Datos en Sql Server 2008
Eduardo Castro
 
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Databricks
 
xPatterns - Spark Summit 2014
xPatterns - Spark Summit   2014xPatterns - Spark Summit   2014
xPatterns - Spark Summit 2014
Claudiu Barbura
 
Scaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksScaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber Attacks
Databricks
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
Spark Summit
 
MLflow on and inside Azure
MLflow on and inside AzureMLflow on and inside Azure
MLflow on and inside Azure
Databricks
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
Dr. Mirko Kämpf
 
Bridging the Completeness of Big Data on Databricks
Bridging the Completeness of Big Data on DatabricksBridging the Completeness of Big Data on Databricks
Bridging the Completeness of Big Data on Databricks
Databricks
 
Spark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham ChopraSpark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark Summit
 
Spark Summit EU talk by Pat Patterson
Spark Summit EU talk by Pat PattersonSpark Summit EU talk by Pat Patterson
Spark Summit EU talk by Pat Patterson
Spark Summit
 
An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...
An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...
An Update on Scaling Data Science Applications with SparkR in 2018 with Heiko...
Databricks
 
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Fighting Cybercrime: A Joint Task Force of Real-Time Data and Human Analytics...
Spark Summit
 
Shifting Data Science into High Gear
Shifting Data Science into High GearShifting Data Science into High Gear
Shifting Data Science into High Gear
Spark Summit
 
Uber's data science workbench
Uber's data science workbenchUber's data science workbench
Uber's data science workbench
Ran Wei
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured Streaming
Databricks
 
Spark at Airbnb
Spark at AirbnbSpark at Airbnb
Spark at Airbnb
Hao Wang
 
Getting Started with Spark Scala
Getting Started with Spark ScalaGetting Started with Spark Scala
Getting Started with Spark Scala
Knoldus Inc.
 
New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...
New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...
New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, ...
Databricks
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
Timothy Spann
 
Minería de Datos en Sql Server 2008
Minería de Datos en Sql Server 2008Minería de Datos en Sql Server 2008
Minería de Datos en Sql Server 2008
Eduardo Castro
 
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Databricks
 
xPatterns - Spark Summit 2014
xPatterns - Spark Summit   2014xPatterns - Spark Summit   2014
xPatterns - Spark Summit 2014
Claudiu Barbura
 
Scaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber AttacksScaling ML-Based Threat Detection For Production Cyber Attacks
Scaling ML-Based Threat Detection For Production Cyber Attacks
Databricks
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
Spark Summit
 
MLflow on and inside Azure
MLflow on and inside AzureMLflow on and inside Azure
MLflow on and inside Azure
Databricks
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
Dr. Mirko Kämpf
 
Bridging the Completeness of Big Data on Databricks
Bridging the Completeness of Big Data on DatabricksBridging the Completeness of Big Data on Databricks
Bridging the Completeness of Big Data on Databricks
Databricks
 
Spark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham ChopraSpark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark and Online Analytics: Spark Summit East talky by Shubham Chopra
Spark Summit
 
Spark Summit EU talk by Pat Patterson
Spark Summit EU talk by Pat PattersonSpark Summit EU talk by Pat Patterson
Spark Summit EU talk by Pat Patterson
Spark Summit
 

Similar to .NET for Azure Synapse (and viceversa) (20)

Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Michael Rys
 
.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversa.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversa
Marco Parenzan
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Michael Rys
 
Azure Databricks - An Introduction 2019 Roadshow.pptx
Azure Databricks - An Introduction 2019 Roadshow.pptxAzure Databricks - An Introduction 2019 Roadshow.pptx
Azure Databricks - An Introduction 2019 Roadshow.pptx
pascalsegoul
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
Alberto Diaz Martin
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Alberto Diaz Martin
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018
Nathan Bijnens
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要
Paulo Gutierrez
 
201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning
Mark Tabladillo
 
PPT5: Neuron Introduction
PPT5: Neuron IntroductionPPT5: Neuron Introduction
PPT5: Neuron Introduction
akira-ai
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
Trivadis
 
ApacheCon 2021 Apache Deep Learning 302
ApacheCon 2021   Apache Deep Learning 302ApacheCon 2021   Apache Deep Learning 302
ApacheCon 2021 Apache Deep Learning 302
Timothy Spann
 
Scala and Spark are Ideal for Big Data - Data Science Pop-up Seattle
Scala and Spark are Ideal for Big Data - Data Science Pop-up SeattleScala and Spark are Ideal for Big Data - Data Science Pop-up Seattle
Scala and Spark are Ideal for Big Data - Data Science Pop-up Seattle
Domino Data Lab
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
Mark Tabladillo
 
Microsoft Fabric Certication Course | Microsoft Fabric Training.pptx
Microsoft Fabric Certication Course | Microsoft Fabric Training.pptxMicrosoft Fabric Certication Course | Microsoft Fabric Training.pptx
Microsoft Fabric Certication Course | Microsoft Fabric Training.pptx
TalluriRenuka
 
Apache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopApache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code Workshop
Amanda Casari
 
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
Timothy McAliley
 
TechEvent Databricks on Azure
TechEvent Databricks on AzureTechEvent Databricks on Azure
TechEvent Databricks on Azure
Trivadis
 
Data Engineering A Deep Dive into Databricks
Data Engineering A Deep Dive into DatabricksData Engineering A Deep Dive into Databricks
Data Engineering A Deep Dive into Databricks
Knoldus Inc.
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
James Serra
 
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Michael Rys
 
.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversa.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversa
Marco Parenzan
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Michael Rys
 
Azure Databricks - An Introduction 2019 Roadshow.pptx
Azure Databricks - An Introduction 2019 Roadshow.pptxAzure Databricks - An Introduction 2019 Roadshow.pptx
Azure Databricks - An Introduction 2019 Roadshow.pptx
pascalsegoul
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
Alberto Diaz Martin
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Alberto Diaz Martin
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018
Nathan Bijnens
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要
Paulo Gutierrez
 
201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning
Mark Tabladillo
 
PPT5: Neuron Introduction
PPT5: Neuron IntroductionPPT5: Neuron Introduction
PPT5: Neuron Introduction
akira-ai
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
Trivadis
 
ApacheCon 2021 Apache Deep Learning 302
ApacheCon 2021   Apache Deep Learning 302ApacheCon 2021   Apache Deep Learning 302
ApacheCon 2021 Apache Deep Learning 302
Timothy Spann
 
Scala and Spark are Ideal for Big Data - Data Science Pop-up Seattle
Scala and Spark are Ideal for Big Data - Data Science Pop-up SeattleScala and Spark are Ideal for Big Data - Data Science Pop-up Seattle
Scala and Spark are Ideal for Big Data - Data Science Pop-up Seattle
Domino Data Lab
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
Mark Tabladillo
 
Microsoft Fabric Certication Course | Microsoft Fabric Training.pptx
Microsoft Fabric Certication Course | Microsoft Fabric Training.pptxMicrosoft Fabric Certication Course | Microsoft Fabric Training.pptx
Microsoft Fabric Certication Course | Microsoft Fabric Training.pptx
TalluriRenuka
 
Apache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopApache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code Workshop
Amanda Casari
 
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020NOVA SQL User Group - Azure Synapse Analytics Overview -  May 2020
NOVA SQL User Group - Azure Synapse Analytics Overview - May 2020
Timothy McAliley
 
TechEvent Databricks on Azure
TechEvent Databricks on AzureTechEvent Databricks on Azure
TechEvent Databricks on Azure
Trivadis
 
Data Engineering A Deep Dive into Databricks
Data Engineering A Deep Dive into DatabricksData Engineering A Deep Dive into Databricks
Data Engineering A Deep Dive into Databricks
Knoldus Inc.
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
James Serra
 
Ad

More from Marco Parenzan (20)

Azure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineerAzure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineer
Marco Parenzan
 
Azure Hybrid @ Home
Azure Hybrid @ HomeAzure Hybrid @ Home
Azure Hybrid @ Home
Marco Parenzan
 
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptxStatic abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Marco Parenzan
 
Azure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT SolutionsAzure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT Solutions
Marco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central
Marco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralPower BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central
Marco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralPower BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central
Marco Parenzan
 
Developing Actors in Azure with .net
Developing Actors in Azure with .netDeveloping Actors in Azure with .net
Developing Actors in Azure with .net
Marco Parenzan
 
Math with .NET for you and Azure
Math with .NET for you and AzureMath with .NET for you and Azure
Math with .NET for you and Azure
Marco Parenzan
 
Power BI data flow and Azure IoT Central
Power BI data flow and Azure IoT CentralPower BI data flow and Azure IoT Central
Power BI data flow and Azure IoT Central
Marco Parenzan
 
.net for fun: write a Christmas videogame
.net for fun: write a Christmas videogame.net for fun: write a Christmas videogame
.net for fun: write a Christmas videogame
Marco Parenzan
 
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Marco Parenzan
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
Marco Parenzan
 
Deploy Microsoft Azure Data Solutions
Deploy Microsoft Azure Data SolutionsDeploy Microsoft Azure Data Solutions
Deploy Microsoft Azure Data Solutions
Marco Parenzan
 
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetDeep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Marco Parenzan
 
Azure IoT Central
Azure IoT CentralAzure IoT Central
Azure IoT Central
Marco Parenzan
 
Anomaly Detection with Azure and .net
Anomaly Detection with Azure and .netAnomaly Detection with Azure and .net
Anomaly Detection with Azure and .net
Marco Parenzan
 
Code Generation for Azure with .net
Code Generation for Azure with .netCode Generation for Azure with .net
Code Generation for Azure with .net
Marco Parenzan
 
Running Kafka and Spark on Raspberry PI with Azure and some .net magic
Running Kafka and Spark on Raspberry PI with Azure and some .net magicRunning Kafka and Spark on Raspberry PI with Azure and some .net magic
Running Kafka and Spark on Raspberry PI with Azure and some .net magic
Marco Parenzan
 
Code Generation for Azure with .net
Code Generation for Azure with .netCode Generation for Azure with .net
Code Generation for Azure with .net
Marco Parenzan
 
Azure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineerAzure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineer
Marco Parenzan
 
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptxStatic abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Marco Parenzan
 
Azure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT SolutionsAzure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT Solutions
Marco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central
Marco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralPower BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central
Marco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralPower BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central
Marco Parenzan
 
Developing Actors in Azure with .net
Developing Actors in Azure with .netDeveloping Actors in Azure with .net
Developing Actors in Azure with .net
Marco Parenzan
 
Math with .NET for you and Azure
Math with .NET for you and AzureMath with .NET for you and Azure
Math with .NET for you and Azure
Marco Parenzan
 
Power BI data flow and Azure IoT Central
Power BI data flow and Azure IoT CentralPower BI data flow and Azure IoT Central
Power BI data flow and Azure IoT Central
Marco Parenzan
 
.net for fun: write a Christmas videogame
.net for fun: write a Christmas videogame.net for fun: write a Christmas videogame
.net for fun: write a Christmas videogame
Marco Parenzan
 
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Marco Parenzan
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
Marco Parenzan
 
Deploy Microsoft Azure Data Solutions
Deploy Microsoft Azure Data SolutionsDeploy Microsoft Azure Data Solutions
Deploy Microsoft Azure Data Solutions
Marco Parenzan
 
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetDeep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Marco Parenzan
 
Anomaly Detection with Azure and .net
Anomaly Detection with Azure and .netAnomaly Detection with Azure and .net
Anomaly Detection with Azure and .net
Marco Parenzan
 
Code Generation for Azure with .net
Code Generation for Azure with .netCode Generation for Azure with .net
Code Generation for Azure with .net
Marco Parenzan
 
Running Kafka and Spark on Raspberry PI with Azure and some .net magic
Running Kafka and Spark on Raspberry PI with Azure and some .net magicRunning Kafka and Spark on Raspberry PI with Azure and some .net magic
Running Kafka and Spark on Raspberry PI with Azure and some .net magic
Marco Parenzan
 
Code Generation for Azure with .net
Code Generation for Azure with .netCode Generation for Azure with .net
Code Generation for Azure with .net
Marco Parenzan
 
Ad

Recently uploaded (20)

Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Dele Amefo
 
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Andre Hora
 
PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025
mu394968
 
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
Andre Hora
 
Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025
kashifyounis067
 
Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]
saniaaftab72555
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and CollaborateMeet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Maxim Salnikov
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Automation Techniques in RPA - UiPath Certificate
Automation Techniques in RPA - UiPath CertificateAutomation Techniques in RPA - UiPath Certificate
Automation Techniques in RPA - UiPath Certificate
VICTOR MAESTRE RAMIREZ
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Salesforce Data Cloud- Hyperscale data platform, built for Salesforce.
Dele Amefo
 
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)
Andre Hora
 
PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025PDF Reader Pro Crack Latest Version FREE Download 2025
PDF Reader Pro Crack Latest Version FREE Download 2025
mu394968
 
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest (MSR...
Andre Hora
 
Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025Adobe After Effects Crack FREE FRESH version 2025
Adobe After Effects Crack FREE FRESH version 2025
kashifyounis067
 
Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]Get & Download Wondershare Filmora Crack Latest [2025]
Get & Download Wondershare Filmora Crack Latest [2025]
saniaaftab72555
 
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& ConsiderationsDesigning AI-Powered APIs on Azure: Best Practices& Considerations
Designing AI-Powered APIs on Azure: Best Practices& Considerations
Dinusha Kumarasiri
 
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and CollaborateMeet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Meet the Agents: How AI Is Learning to Think, Plan, and Collaborate
Maxim Salnikov
 
Kubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptxKubernetes_101_Zero_to_Platform_Engineer.pptx
Kubernetes_101_Zero_to_Platform_Engineer.pptx
CloudScouts
 
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...Exploring Code Comprehension  in Scientific Programming:  Preliminary Insight...
Exploring Code Comprehension in Scientific Programming: Preliminary Insight...
University of Hawai‘i at Mānoa
 
Automation Techniques in RPA - UiPath Certificate
Automation Techniques in RPA - UiPath CertificateAutomation Techniques in RPA - UiPath Certificate
Automation Techniques in RPA - UiPath Certificate
VICTOR MAESTRE RAMIREZ
 
How can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptxHow can one start with crypto wallet development.pptx
How can one start with crypto wallet development.pptx
laravinson24
 
Societal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainabilitySocietal challenges of AI: biases, multilinguism and sustainability
Societal challenges of AI: biases, multilinguism and sustainability
Jordi Cabot
 
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
Interactive odoo dashboards for sales, CRM , Inventory, Invoice, Purchase, Pr...
AxisTechnolabs
 
Maxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINKMaxon CINEMA 4D 2025 Crack FREE Download LINK
Maxon CINEMA 4D 2025 Crack FREE Download LINK
younisnoman75
 
Revolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptxRevolutionizing Residential Wi-Fi PPT.pptx
Revolutionizing Residential Wi-Fi PPT.pptx
nidhisingh691197
 
WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)WinRAR Crack for Windows (100% Working 2025)
WinRAR Crack for Windows (100% Working 2025)
sh607827
 
Solidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license codeSolidworks Crack 2025 latest new + license code
Solidworks Crack 2025 latest new + license code
aneelaramzan63
 
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdfMicrosoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
Microsoft AI Nonprofit Use Cases and Live Demo_2025.04.30.pdf
TechSoup
 
Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025Avast Premium Security Crack FREE Latest Version 2025
Avast Premium Security Crack FREE Latest Version 2025
mu394968
 

.NET for Azure Synapse (and viceversa)

  • 1. Marco Parenzan marco [dot] parenzan [at] 1nn0va [dot] it www.marcoparenzan.com
  • 2. Marco Parenzan Solution Sales Specialist in Insight for Digital Innovation Azure MVP Community Lead 1nn0va // Pordenone Linkedin: https://www.linkedin.com/in/marcoparenzan/
  • 3. Road to be a newbie “Data *” for a seasoned C# dev in a Pythonic world
  • 4. Data Processing in a Azure World... • Thousands of IoT sensors in a factory, producing petabytes of data • Started with Stream Analytics... • Like Azure Data Explorer... • Data Warehouse... • ...but there is a standard? Yes!
  • 5. Apache Spark The data c ompute exper ienc e
  • 6. Spark Unifies:  Batch Processing  Interactive SQL  Real-time processing  Machine Learning  Deep Learning  Graph Processing An unified, open source, parallel, data processing framework for Big Data Analytics Spark Core Engine Spark SQL Batch processing Spark Structured Streaming Stream processing Spark MLlib Machine Learning Yarn Spark MLlib Machine Learning Spark Streaming Stream processing GraphX Graph Computation http://spark.apache.org Apache Spark
  • 7. Data Sources (HDFS, SQL, NoSQL, …) Cluster Manager Node Node Node Cache Cache Cache Driver Program SparkContext General Spark Cluster Architecture • ‘Driver’ runs the user’s ‘main’ function and executes the various parallel operations on the worker nodes. • The results of the operations are collected by the driver • The worker nodes read and write data from/to Data Sources including HDFS. • Worker node also cache transformed data in memory as RDDs (Resilient Data Sets). • Worker nodes and the Driver Node execute as VMs in public clouds (AWS, Google and Azure).
  • 8. Elements of Spark User’s Program Spark Session Task1 Task 2 Task 3 Task 4 Driver Executor Cluster Manager
  • 9. Read from HDFS Write to HDFS Read from HDFS Write to HDFS Read from HDFS What makes Spark fast
  • 10. DataFrame as the core of Spark • Recipe: • Create Session • Create Dataframe • Define a user defined function • Manipulate and view Data CSV Data JSON Data RDBMS Data Parquet Data Binary Data DataFrame User programs against the DataFrame abstraction
  • 11. What about .NET? • In a recent survey, more than 70% of .NET devs expressed interest in Apache Spark • Millions of lines of big data-usable business logic are written in .NET • But .NET devs are locked out from big data processing – lack of .NET support in OSS big data solutions • We want a first-class .net experience in Spark
  • 12. .NET for Apache Spark • .NET bindings (C# e F#) to Spark • Written on the Spark interop layer, designed to provide high performance bindings to multiple languages • Re-use knowledge, skills, code you have as a .NET developer • Compliant with .NET Standard • You can use .NET for Apache Spark anywhere you write .NET code • Original project Moebius • https://github.com/microsoft/Mobius
  • 13. .NET Spark support Spark DataFramews with SparkSQL • Spark 2.3.x, 2.4.x, 3.0 • ~300 SparkSQL function • DeltaLake .NET Standard 2.0 • C#/F# • .NET Framework 4.6.1+ • .NET Core 2.1+ Batch&Streaming • Structured Streaming Data Science • ML.NET • Notebooks
  • 14. Using .NET for Spark • Get started with .NET for Apache Spark | Microsoft Docs • https://docs.microsoft.com/en- us/dotnet/spark/tutorials/get- started?tabs=windows • Install .NET • Install Java • Install Apache Spark • Install .NET for Apache Spark • Create your app • Install NuGet package
  • 15. Batch vs. Notebooks • Work on slow data stored into a Datalake • Submit a complete app in one single deploy • Receive the entire output • Use «spart-submit» • Run cell by cell (but also all at once)
  • 17. Evolution of REPL • At the beginning there was mono • Then Dynamic/DLR (C# 4) • C#/F# interactive • .NET Try • In a world of: • Python • Mathematica
  • 18. Jupyter • Evolution and generalization of the seminal role of Mathematica (notebook) • +Python adoption (ipynb) • +Web (HTTP+Markdown) • +Kernel
  • 19. Demo
  • 21. Run Spark with .NET in a container • Not an official Microsoft image • https://hub.docker.com/r/3rdman/dotnet-spark • Install nothing on your machine other than docker • Launch and Debug your code from Visual Studio and Visual Studio Code • Very good for development
  • 22. Demo
  • 24. Engine for business-changing insights with seamless ecosystem integration Azure Synapse Analytics Data integration Data warehousing Big data processing Azure Data Lake Storage + Common Data Model
  • 25. Azure Synapse Analytics Limitless analytics service with unmatched time to insight Synapse Analytics Platform Azure Data Lake Storage Common Data Model Enterprise Security Optimized for Analytics Data lake integrated and Common Data Model aware METASTORE SECURITY MANAGEMENT MONITORING Integrated platform services for, management, security, monitoring, and meta-store DATA INTEGRATION Analytics Runtimes Integrated analytics runtimes available provisioned and serverless on-demand Synapse SQL offering T-SQL for batch, streaming and interactive processing Apache Spark for big data processing with Python, Scala and .NET PROVISIONED ON-DEMAND Form Factors SQL Languages Python .NET Java Scala R Multiple languages suited to different analytics workloads Experience Synapse Analytics Studio SaaS developer experiences for code free and code first Artificial Intelligence / Machine Learning / Internet of Things Intelligent Apps / Business Intelligence Designed for analytics workloads at any scale METASTORE SECURITY MANAGEMENT MONITORING
  • 26. Manage – Apache Spark pools • Overview • Provides ability to Pause, Scale, Assign Tags and upload packages from Studio.
  • 29. Synapse Service Job Service Frontend Spark API Controller … Job Service Backend Spark Plugin Gateway Resource Provider DB Synapse Studio AAD Auth Service Instance Creation Service DBDB Azure Spark Instance VM VM VM VM VM … VM Synapse Job Service
  • 30. Develop Hub - Notebooks • Notebooks • Allows to write multiple languages in one notebook • %%<Name of language> • Offers use of temporary tables across languages • Language support for Syntax highlight, syntax error, syntax code completion, smart indent, code folding • Export results
  • 31. Demo
  • 33. Add a footer 33 Conclusion • Spark is here to stay • Now it is a place for .NET skills too • Azure Synapse Analytics the best fusion between the old and the new world