Hands-On with U-SQL and
Azure Data Lake Analytics
(ADLA)
A first look at U-SQL on Azure Data Lake Public Preview
Jason Brugger (@JasonLBrugger)
MCSE: Data Platform, MCSE: Business
Intelligence
July 16, 2016
This presentation has been
modified from its original format.
Animations have been removed
and it has been reformatted for
publication on Slideshare.net.
Assumptions
• You are familiar with the differences between a traditional
RDBMS and a Big Data solution.
• You are familiar with both T-SQL and C#.
What is a data lake? What are Azure Data
Lake Store and Azure Data Lake Analytics?
• A data lake is a storage repository that holds a vast amount of raw
data in its native format until it is needed. – Margaret Rouse (on
AWS)
• Pentaho CTO James Dixon has generally been credited with coining
the term “data lake”. He describes a data mart (a subset of a data
warehouse) as akin to a bottle of water… “cleansed, packaged and
structured for easy consumption” while a data lake is more like a
body of water in its natural state. – Chris Campbell, Blue Granite
• Data Lake Analytics is an Azure Big Data computation service that
lets you use data to drive your business using the insights gained
from your data in the cloud, regardless of where it is and regardless
of its size. – Ed Macauley, Microsoft
[Diagram: Data Lake Store and Data Lake Analytics]
ADLA vs. HDInsight (e.g. Hadoop)
• HDInsight (Cluster as a
Service)
• Provision cluster of n nodes
• Run your queries
• Delete cluster
• (Repeat)
• ADLA (Query as a service)
• Don’t provision anything
• Specify node count (parallelism)
at job submission time
• Pay per query
Getting Started – What’s Needed?
• Azure subscription
• Sign up for ADL preview
• Visual Studio 2015 + Azure
Data Lake Tools for Visual
Studio
• Microsoft Azure PowerShell
(1.0+ via WPI)
• Not to be confused with the
version of Windows PowerShell
itself, e.g. 5.0.
• Microsoft Azure SDK for .NET
(Optional)
Tools and Navigation
• Azure Portal
• Visual Studio
• Server Explorer
• Project Templates
• PowerShell
• SDKs
• C#
• Node.js
Getting data into ADL
• Portal
• PowerShell
• Login-AzureRmAccount
• Import-AzureRmDataLakeStoreItem
• Connecting to External Data
(Demo #2)
• SSIS
• ADF
The Data (NOAA Weather observations)
Station Datekey Element Value Mflag Qflag Sflag TimeKey
US1FLSL0019 20150101 PRCP 173 N
US1TXTV0133 20150101 PRCP 119 N
USC00178998 20150101 TMAX -33 700
USC00178998 20150101 TMIN -167 700
USC00178998 20150101 TOBS -67 700
USC00178998 20150101 PRCP 0 700
USC00178998 20150101 SNOW 0
USC00178998 20150101 SNWD 0
USR0000CSNR 20150101 TMAX 194 H D U
• Notice sparse data with many null values
• Multiple observation types, per site, per day
• Temperature values are in tenths of a degree C
• Correlates to external data uploaded to Azure SQL Database
Basic U-SQL query
• Load .csv file from Data Lake Store using built-in Extractor
• Schematize using C# data types; note nullability
• Output schematized rows to a table variable
• SELECT using familiar SQL-like query
• Output query result to Data Lake Store using built-in Outputter
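The original U-SQL script was shown as a screenshot and is not reproduced here. As a rough stand-in, the same extract, schematize, select, output flow can be sketched in Python. This is an illustration of the steps only, not U-SQL; the sample rows are adapted from the data slide, the placement of the flag values in the CSV columns is an assumption, and the names Observation, extract, count_by_element and output are hypothetical.

```python
import csv
import io
from collections import Counter
from typing import NamedTuple, Optional

# Illustrative rows adapted from the data slide; flag column placement is assumed.
RAW = """\
US1FLSL0019,20150101,PRCP,173,,,N,
USC00178998,20150101,TMAX,-33,,,,700
USC00178998,20150101,TMIN,-167,,,,700
USC00178998,20150101,SNOW,0,,,,
USR0000CSNR,20150101,TMAX,194,H,D,U,
"""


class Observation(NamedTuple):
    """Schema for one row; Optional marks the nullable columns."""
    station: str
    datekey: int
    element: str
    value: Optional[int]
    mflag: Optional[str]
    qflag: Optional[str]
    sflag: Optional[str]
    timekey: Optional[str]


def none_if_empty(text):
    """Map an empty CSV field to None (the 'null' of this sketch)."""
    return text if text != "" else None


def extract(source):
    """Read CSV text and apply the schema, one Observation per line."""
    for station, datekey, element, value, mflag, qflag, sflag, timekey in csv.reader(source):
        yield Observation(
            station,
            int(datekey),
            element,
            int(value) if value != "" else None,
            none_if_empty(mflag),
            none_if_empty(qflag),
            none_if_empty(sflag),
            none_if_empty(timekey),
        )


def count_by_element(rows):
    """Equivalent in spirit to SELECT Element, COUNT(*) ... GROUP BY Element."""
    return Counter(row.element for row in rows)


def output(counts, sink):
    """Write the aggregated result back out as CSV."""
    writer = csv.writer(sink)
    for element, count in sorted(counts.items()):
        writer.writerow([element, count])


rows = list(extract(io.StringIO(RAW)))
counts = count_by_element(rows)
sink = io.StringIO()
output(counts, sink)
```

Declaring the nullable columns up front is the point of the "note nullability" callout: a sparse row must still fit the schema.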
Key Takeaways & ‘gotchas’
• SQL keywords MUST be uppercase
• Header rows not currently supported by Extractor
• e.g. skipFirstNRows:1 not currently supported
• Be mindful of nullability in C# types
• Built-in extractors and outputters include .Csv(), .Tsv(), & .Text()
• Various options such as delimiter
• Build custom extractors by inheriting IExtractor
DEMO #1
• Demo local execution
• Simple aggregation of 10,000 rows down to 43, by element
type
Persisted schema with metadata object model
• Familiar CREATE DATABASE statement
• Familiar CREATE VIEW statement; the view maintains the extractor and schema definitions, so from now on we can just select from the view.
• Note data kept in its native compressed (.gz) format. Extractor handles decompression in this case.
• Wildcard {*} yields file-set of all matching files
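To make the file-set idea concrete, here is a small Python analogy, not U-SQL: a wildcard pattern selects every matching compressed file, and each file is decompressed on read. The file names, the 2014 sample value, and the helper names write_sample_files and read_file_set are made up for this sketch.

```python
import csv
import glob
import gzip
import os
import tempfile


def write_sample_files(folder):
    """Create two small .gz files standing in for the yearly NOAA files."""
    samples = {
        "2014.csv.gz": "USC00178998,20140101,TMAX,-50\n",  # made-up value
        "2015.csv.gz": "USC00178998,20150101,TMAX,-33\n"
                       "USC00178998,20150101,TMIN,-167\n",
    }
    for name, text in samples.items():
        with gzip.open(os.path.join(folder, name), "wt", newline="") as handle:
            handle.write(text)


def read_file_set(pattern):
    """Yield (file name, row) for every row of every file matching the pattern."""
    for path in sorted(glob.glob(pattern)):
        # gzip.open in text mode decompresses transparently while reading.
        with gzip.open(path, "rt", newline="") as handle:
            for row in csv.reader(handle):
                yield os.path.basename(path), row


with tempfile.TemporaryDirectory() as folder:
    write_sample_files(folder)
    rows = list(read_file_set(os.path.join(folder, "*.csv.gz")))
```

The data stays compressed at rest; only the reader needs to know about the format.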
Combining with external data
• Create catalog secret using
PowerShell (specifies remote Host &
credentials & ADLA catalog)
• New-AzureRmDataLakeAnalyticsCatalogSecret
• Create credential (in turn, references
catalog secret)
• Create data source (in turn, references
credential & specifies data source
type (e.g. Azure SQL Db) & specifies
remote catalog)
• CREATE EXTERNAL TABLE denotes underlying table resides remotely
• Schema using C# types
• Remote table name
External data with federated query
• SELECT FROM EXTERNAL data source EXECUTE
• Embedded query executes remotely at data source. This is T-SQL, not U-SQL
• Table variable contains only rows returned
id name
US1FLHB0090 TAMPA 10.2 NNW
US1FLHB0048 GREATER NORTHDALE 0.4 ENE
USW00012810 MACDILL AFB
USC00088890 TEMPLE TERRACES
US1FLHB0007 TAMPA 8.4 NW
US1FLHB0025 CARROLLWOOD 1.7 SE
US1FLHB0040 UNIVERSITY WEST 2.0 WNW
USC00088786 TAMPA
US1FLHB0028 WEST PARK 0.4 S
US1FLHB0055 TAMPA 5.0 NNE
US1FLHB0012 CARROLLWOOD 0.5 WNW
US1FLHB0096 TAMPA 5.4 SSW
US1FLHB0005 CITRUS PARK 1.3 ENE
USC00080520 BAY LAKE
US1FLHB0093 TEMPLE TERRACE 1.5 SE
US1FLHB0039 CARROLLWOOD 2.0 SSE
US1FLHB0087 TAMPA 7.9 N
US1FLHB0071 TAMPA 6.1 N
USW00012842 TAMPA INTL AP
US1FLHB0010 TAMPA 5.1 S
US1FLHB0036 TAMPA 4.4 SSW
US1FLHB0029 TAMPA 6.5 NNE
US1FLHB0064 TAMPA 4.7 NW
US1FLHB0051 LUTZ 2.2 SSE
Data relationship exhibit
Azure SQL Db
Federated Query
dbo.station
dbo.calendar
Azure Data Lake Analytics
@station_tpa
dbo.calendar
dbo.observation
Result
Azure Data Lake Store
Complex types in U-SQL & EXPLODE
• SQL.ARRAY
• Like a List or Array in C#
• Can be used in conjunction with
String.Split()
• SQL.MAP
• Key-Value pairs
• Like a Dictionary (Hash table) in
C#
• EXPLODE
• Expands to rows
Before EXPLODE:
ID (int) Data (SQL.MAP)
1 ((“A”,25),(“B”,35),(“C”,45))
2 ((“A”,27),(“B”,38),(“C”,42))
After EXPLODE:
ID Key Value
1 A 25
1 B 35
1 C 45
2 A 27
2 B 38
2 C 42
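The row expansion in the exhibit can be mimicked in a few lines of Python. This is only a model of what EXPLODE does to a map column, using a dict in place of SQL.MAP; the function name explode is chosen for illustration.

```python
def explode(rows):
    """Yield one (id, key, value) row per entry of each row's map."""
    for row_id, mapping in rows:
        for key, value in mapping.items():
            yield (row_id, key, value)


# The two source rows from the exhibit, with a dict standing in for SQL.MAP.
source = [
    (1, {"A": 25, "B": 35, "C": 45}),
    (2, {"A": 27, "B": 38, "C": 42}),
]

exploded = list(explode(source))
```

Two input rows with three map entries each become six output rows, matching the exhibit.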
Tying it all together with U-SQL
• Familiar JOIN syntax; note double equals “==”, the only supported JOIN comparison operator
• Familiar WHERE and GROUP BY syntax
• CROSS APPLY the value (recall this was in tenths of degrees C)
• Declare a new SQL.MAP with conversion factors by C, F, and K
• EXPLODE the SQL.MAP into rows and new columns scale, temp
• Familiar aggregation AVG using exploded column temp
• Using String.Concat .NET method to build description of derived column, e.g. “AVG_TMAX_F”, AVG_TMAX_C, AVG_TMAX_K, AVG_TMIN_F, AVG_TMIN_C, AVG_TMIN_K
• CREATE TABLE AS SELECT (CTAS) – conceptually similar to SELECT INTO
• No HEAPs in ADLA; clustered index required
• Partitioned by Round Robin distributes data evenly.
• No update or merge support
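The map, explode, average and label steps above can be traced with a small Python sketch. It is an approximation of the logic, not the original U-SQL: the map here holds already-converted temperatures per scale (the slide's map may have been organised differently), the three sample values come from the data slide, and the function names are hypothetical.

```python
from collections import defaultdict


def temperature_map(tenths_c):
    """Build a scale -> temperature map from a value in tenths of a degree C."""
    celsius = tenths_c / 10.0
    return {
        "C": celsius,
        "F": celsius * 9.0 / 5.0 + 32.0,
        "K": celsius + 273.15,
    }


def explode_scales(observations):
    """One (element, scale, temp) row per scale, like CROSS APPLY EXPLODE."""
    for element, tenths_c in observations:
        for scale, temp in temperature_map(tenths_c).items():
            yield element, scale, temp


def average_by_label(observations):
    """Average temp per element and scale, labelled e.g. AVG_TMAX_F."""
    totals = defaultdict(lambda: [0.0, 0])
    for element, scale, temp in explode_scales(observations):
        label = "".join(["AVG_", element, "_", scale])
        totals[label][0] += temp
        totals[label][1] += 1
    return {label: total / count for label, (total, count) in totals.items()}


observations = [("TMAX", -33), ("TMAX", 194), ("TMIN", -167)]
averages = average_by_label(observations)
```

Exploding one value into three scale rows is what lets a single AVG produce all three temperature scales at once.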
DEMO #2
• Data Lake data set consists of daily readings from 98,035 stations over
5 years
• ~32,720,048 rows per file
• About 164M rows total
• 24 Tampa stations
• Filtering and aggregating it down to 5 years x 12 months x 2 elements x
3 temperature scales, or 360 rows
• Monitor job execution status
• Streams, Vertices, Display avg execution time (heat map), Diagnostics,
History, Script*
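A quick check of the row counts quoted above, assuming one data file per year (an assumption based on the by-year layout of the source data):

```python
years, months, elements, scales = 5, 12, 2, 3

# Rows left after filtering and aggregating.
result_rows = years * months * elements * scales

# Input volume, using the per-file row count quoted above.
rows_per_file = 32_720_048
total_rows = years * rows_per_file
```

The total comes to 163,600,240 input rows, consistent with "about 164M", reduced to 360 result rows.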
Working with Assemblies & Libraries
• Code-behind file
• Convenient, simple
• Assembly created and referenced
automatically
• No support for NuGet, but manually add
references…OR:
• Class library
• Right-click, register assembly to ADLA
• Option to automatically copy to DLS
• NuGet supported normally
Example: Simple Linear Regression to predict temp
• .dlls for statistics library copied to Data Lake Store
• CREATE ASSEMBLY from file; (Can also create from binary)
• Custom C# class method signature: Noaa.Predict.Regress(int, SqlMap<int, decimal?>) : decimal?
• MAP_AGG() function returns SQL.MAP – like a reverse EXPLODE, which we pass as function parameter
MAP_AGG() Exhibit
Month Year Avg Temp
1 2011 34
1 2012 36
1 2013 33
1 2014 35
1 2015 37
2 2011 41
2 2012 39
…
12 2015 26
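The aggregation in the exhibit can be modelled in Python as grouping by Month and folding each group's (Year, Avg Temp) pairs into a dict. This is a sketch of the idea behind MAP_AGG(), not its implementation; only the rows visible in the exhibit are used, and the elided rows are left out.

```python
from collections import defaultdict


def map_agg(rows):
    """Group rows by month and collect year -> avg temp into one map per month."""
    grouped = defaultdict(dict)
    for month, year, avg_temp in rows:
        grouped[month][year] = avg_temp
    return dict(grouped)


# Rows shown in the exhibit (the elided rows are not reproduced).
exhibit = [
    (1, 2011, 34), (1, 2012, 36), (1, 2013, 33), (1, 2014, 35), (1, 2015, 37),
    (2, 2011, 41), (2, 2012, 39),
    (12, 2015, 26),
]

by_month = map_agg(exhibit)
```

Each month ends up as a single row carrying its whole year-by-year series, which is the shape the regression function expects.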
.CS code-behind file
• Return predicted Temp
• Year to predict, e.g. 2016
• SqlMap contains series against which regression is performed
• Referenced Library namespace
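The code-behind itself was a screenshot and relied on a referenced statistics library. As a library-free stand-in, a simple linear regression with the same shape of inputs (year to predict, year -> temperature series, nullable result) might look like the Python below. It uses ordinary least squares written out by hand, which is an assumption about the method; the function name regress mirrors the slide's signature, and the January series comes from the MAP_AGG() exhibit.

```python
from typing import Mapping, Optional


def regress(year_to_predict: int,
            series: Mapping[int, Optional[float]]) -> Optional[float]:
    """Fit temp = intercept + slope * year by least squares and predict one year."""
    # Ignore missing temperatures, mirroring the nullable values in the signature.
    points = [(year, temp) for year, temp in series.items() if temp is not None]
    if len(points) < 2:
        return None
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None  # all points share one year, so no slope can be fitted
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * year_to_predict


january = {2011: 34, 2012: 36, 2013: 33, 2014: 35, 2015: 37}
predicted = regress(2016, january)
```

For the January series the fitted slope is 0.5 degrees per year, giving a 2016 prediction of 36.5.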
DEMO #3
• Pivoting our existing averages on Month & aggregating Year &
Temp into Key-Value pairs (SQL.MAP) which we pass as
parameter to custom function
• Passing predictive year (2016) as a parameter
• Limit our selection to just AVG_TMAX_F
• Result adds another 12 rows of predicted temps to our existing
360 row result table
Additional homework subjects, not covered
• Extractor (UDO) by inheriting IExtractor
• IOutputter
• IProcessor – transform single row, read one,
output one
• IReducer – read n rows, output 1 row
• ICombiner – like a user-defined Join
• IApplier – input one row, output n rows
• User-defined Aggregators (IAggregate) – AGG
keyword
• ARRAY_AGG()
• Blob as External storage
• No Primary Keys
• No columnstore (yet)
• Table Value Functions - YES, but not with cross
apply
• No support for R, but leverage .NET libraries as
demo’d
• User-defined Types
• Partitioning by Hash, Direct Hash, Range
• http://www.slideshare.net/MichaelRys/usql-partitioned-data-and-tables-sqlbits-2016
Reference
• Code
• https://github.com/SQL-Jason/NOAA_USQL_Demo.git
• Data
• http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/by_year/
• Blog
• http://jasonbrugger.wordpress.com
Attribution
The Accord.NET Framework
Copyright (c) 2009-2014, César Roberto de Souza
<cesarsouza@gmail.com>
This library is free software; you can redistribute it and/or modify it under
the terms of the GNU Lesser General Public License as published by the
Free Software Foundation; either version 2.1 of the License, or (at your
option) any later version.
The copyright holders provide no reassurances that the source code
provided does not infringe any patent, copyright, or any other intellectual
property rights of third parties. The copyright holders disclaim any liability
to any recipient for claims brought against recipient by any third party
for infringement of that parties intellectual property rights.
This library is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.
National Oceanic and Atmospheric Administration (NOAA)
README FILE FOR DAILY GLOBAL HISTORICAL CLIMATOLOGY NETWORK
(GHCN-DAILY) Version 3.22
How to cite:
Note that the GHCN-Daily dataset itself now has a DOI (Digital Object
Identifier) so it may be relevant to cite both the methods/overview journal
article as well as the specific version of the dataset used.
The journal article describing GHCN-Daily is: Menne, M.J., I. Durre, R.S.
Vose, B.E. Gleason, and T.G. Houston, 2012: An overview of the Global
Historical Climatology Network-Daily Database. Journal of Atmospheric
and Oceanic Technology, 29, 897-910, doi:10.1175/JTECH-D-11-00103.1.
To acknowledge the specific version of the dataset used, please cite: Menne,
M.J., I. Durre, B. Korzeniewski, S. McNeal, K. Thomas, X. Yin, S. Anthony, R.
Ray, R.S. Vose, B.E. Gleason, and T.G. Houston, 2012: Global Historical
Climatology Network - Daily (GHCN-Daily), Version 3. [indicate subset used
following decimal, e.g. Version 3.12].
NOAA National Climatic Data Center. http://doi.org/10.7289/V5D21VHZ
[access date].
Bibliography
• Campbell, C. “Top Five Differences between Data Lakes and Data Warehouses.” Business Insights. Blue Granite, 26 Jan 2015. Web. https://www.blue-granite.com/blog/bid/402596/Top-Five-Differences-between-Data-Lakes-and-Data-Warehouses
• Gopalan, R. (21 Jun 2016). U-SQL Part 4: Use custom code to extend U-SQL [Webinar]. PASS Big Data Virtual Chapter.
• Macauley, E. “Overview of Microsoft Azure Data Lake Analytics.” Microsoft Azure. Microsoft, 16 May 2016. Web. https://azure.microsoft.com/en-us/documentation/articles/data-lake-analytics-overview/
• Reddy, S. (31 May 2016). Introduction to Azure Data Lake [Webinar]. PASS Big Data Virtual Chapter.
• Rossello, Justin. “Querying Azure SQL Database from an Azure Data Lake Analytics U-SQL Script.” eat{Code}live. 21 Nov 2015. Web. http://eatcodelive.com/2015/11/21/querying-azure-sql-database-from-an-azure-data-lake-analytics-u-sql-script/
• Rouse, M. “Definition Data Lake.” SearchAws. TechTarget, May 2015. Web. http://searchaws.techtarget.com/definition/data-lake
• Rys, M. (8 Mar 2016). Introducing U-SQL; Part 2 of 2: Scaling U-SQL and doing SQL in U-SQL [Webinar]. PASS Big Data Virtual Chapter. Retrieved from http://www.youtube.com/channel/UCkOKmMW_LEsACOqE8C1RWdw
• Rys, M. (16 Feb 2016). Introducing U-SQL; Part 1 of 2: Introduction and C# extensibility [Webinar]. PASS Big Data Virtual Chapter. Retrieved from http://www.youtube.com/channel/UCkOKmMW_LEsACOqE8C1RWdw
• Rys, M., et al. Azure/usql, (2016), GitHub repository, https://github.com/Azure/usql
• “U-SQL Language Reference.” Microsoft Azure. Microsoft, 28 Oct 2015. Web. https://msdn.microsoft.com/en-US/library/azure/mt591959(Azure.100).aspx
USQ Landdemos Azure Data Lake
Trivadis
 
Rmoug ashmaster
Rmoug ashmasterRmoug ashmaster
Rmoug ashmaster
Kyle Hailey
 
Oracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesOracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the Indices
Kellyn Pot'Vin-Gorman
 
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...
Flink Forward
 
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdfHailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
cookie1969
 
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And WhatPerformance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
udaymoogala
 
Towards sql for streams
Towards sql for streamsTowards sql for streams
Towards sql for streams
Radu Tudoran
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
Julian Hyde
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
ScyllaDB
 
SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6
Mahesh Vallampati
 
SQL Windowing
SQL WindowingSQL Windowing
SQL Windowing
Sandun Perera
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performance
Guy Harrison
 
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalGPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
ScyllaDB
 
Data Time Travel by Delta Time Machine
Data Time Travel by Delta Time MachineData Time Travel by Delta Time Machine
Data Time Travel by Delta Time Machine
Databricks
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Databricks
 
SQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cSQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19c
RachelBarker26
 
TSQL in SQL Server 2012
TSQL in SQL Server 2012TSQL in SQL Server 2012
TSQL in SQL Server 2012
Eduardo Castro
 
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
2° Ciclo Microsoft CRUI 3° Sessione: l'evoluzione delle piattaforme tecnologi...
Jürgen Ambrosi
 
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and AdvisorsYour tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
John Kanagaraj
 
My SYSAUX tablespace is full - please help
My SYSAUX tablespace is full - please helpMy SYSAUX tablespace is full - please help
My SYSAUX tablespace is full - please help
Markus Flechtner
 
USQ Landdemos Azure Data Lake
USQ Landdemos Azure Data LakeUSQ Landdemos Azure Data Lake
USQ Landdemos Azure Data Lake
Trivadis
 
Oracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the IndicesOracle vs. SQL Server- War of the Indices
Oracle vs. SQL Server- War of the Indices
Kellyn Pot'Vin-Gorman
 
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...Flink Forward SF 2017: Timo Walther -  Table & SQL API – unified APIs for bat...
Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for bat...
Flink Forward
 
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdfHailey_Database_Performance_Made_Easy_through_Graphics.pdf
Hailey_Database_Performance_Made_Easy_through_Graphics.pdf
cookie1969
 
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And WhatPerformance Tuning With Oracle ASH and AWR. Part 1 How And What
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
udaymoogala
 
Towards sql for streams
Towards sql for streamsTowards sql for streams
Towards sql for streams
Radu Tudoran
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
ScyllaDB
 
SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6SQL Optimization With Trace Data And Dbms Xplan V6
SQL Optimization With Trace Data And Dbms Xplan V6
Mahesh Vallampati
 
Top 10 tips for Oracle performance
Top 10 tips for Oracle performanceTop 10 tips for Oracle performance
Top 10 tips for Oracle performance
Guy Harrison
 
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data ArchivalGPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
GPS Insight on Using Presto with Scylla for Data Analytics and Data Archival
ScyllaDB
 
Data Time Travel by Delta Time Machine
Data Time Travel by Delta Time MachineData Time Travel by Delta Time Machine
Data Time Travel by Delta Time Machine
Databricks
 
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Tra...
Databricks
 
SQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19cSQL Performance Tuning and New Features in Oracle 19c
SQL Performance Tuning and New Features in Oracle 19c
RachelBarker26
 
Ad

Recently uploaded (20)

183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
How to join illuminati Agent in uganda call+256776963507/0741506136
How to join illuminati Agent in uganda call+256776963507/0741506136How to join illuminati Agent in uganda call+256776963507/0741506136
How to join illuminati Agent in uganda call+256776963507/0741506136
illuminati Agent uganda call+256776963507/0741506136
 
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
Molecular methods diagnostic and monitoring of infection  -  Repaired.pptxMolecular methods diagnostic and monitoring of infection  -  Repaired.pptx
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
7tzn7x5kky
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
computer organization and assembly language.docx
computer organization and assembly language.docxcomputer organization and assembly language.docx
computer organization and assembly language.docx
alisoftwareengineer1
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
Data Analytics Overview and its applications
Data Analytics Overview and its applicationsData Analytics Overview and its applications
Data Analytics Overview and its applications
JanmejayaMishra7
 
GenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.aiGenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.ai
Inspirient
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdfIAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
mcgardenlevi9
 
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.pptJust-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
ssuser5f8f49
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
Deloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit contextDeloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit context
Process mining Evangelist
 
183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag183409-christina-rossetti.pdfdsfsdasggsag
183409-christina-rossetti.pdfdsfsdasggsag
fardin123rahman07
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
Molecular methods diagnostic and monitoring of infection  -  Repaired.pptxMolecular methods diagnostic and monitoring of infection  -  Repaired.pptx
Molecular methods diagnostic and monitoring of infection - Repaired.pptx
7tzn7x5kky
 
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
CTS EXCEPTIONSPrediction of Aluminium wire rod physical properties through AI...
ThanushsaranS
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
computer organization and assembly language.docx
computer organization and assembly language.docxcomputer organization and assembly language.docx
computer organization and assembly language.docx
alisoftwareengineer1
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
Data Analytics Overview and its applications
Data Analytics Overview and its applicationsData Analytics Overview and its applications
Data Analytics Overview and its applications
JanmejayaMishra7
 
GenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.aiGenAI for Quant Analytics: survey-analytics.ai
GenAI for Quant Analytics: survey-analytics.ai
Inspirient
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdfIAS-slides2-ia-aaaaaaaaaaain-business.pdf
IAS-slides2-ia-aaaaaaaaaaain-business.pdf
mcgardenlevi9
 
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.pptJust-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
ssuser5f8f49
 
Calories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptxCalories_Prediction_using_Linear_Regression.pptx
Calories_Prediction_using_Linear_Regression.pptx
TijiLMAHESHWARI
 
Deloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit contextDeloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit context
Process mining Evangelist
 

Hands-On with U-SQL and Azure Data Lake Analytics (ADLA)

  • 1. Hands-On with U-SQL and Azure Data Lake Analytics (ADLA) A first look at U-SQL on Azure Data Lake Public Preview Jason Brugger (@JasonLBrugger) MCSE: Data Platform, MCSE: Business Intelligence July 16, 2016
  • 2. This presentation has been modified from its original format. Animations have been removed and it has been reformatted for publication on Slideshare.net.
  • 3. Assumptions • You are familiar with the differences between a traditional RDBMS and a Big Data solution. • You are familiar with both T-SQL and C#.
  • 4. What is a data lake? What are Azure Data Lake Store and Azure Data Lake Analytics? • A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. – Margaret Rouse (on AWS) • Pentaho CTO James Dixon has generally been credited with coining the term “data lake”. He describes a data mart (a subset of a data warehouse) as akin to a bottle of water… “cleansed, packaged and structured for easy consumption” while a data lake is more like a body of water in its natural state. – Chris Campbell, Blue Granite • Data Lake Analytics is an Azure Big Data computation service that lets you use data to drive your business using the insights gained from your data in the cloud, regardless of where it is and regardless of its size. – Ed Macauley, Microsoft (Diagram labels: Data Lake Store, Data Lake Analytics)
  • 5. ADLA vs. HDInsight (e.g. Hadoop) • HDInsight (Cluster as a Service) • Provision cluster of n nodes • Run your queries • Delete cluster • (Repeat) • ADLA (Query as a service) • Don’t provision anything • Specify node count (parallelism) at job submission time • Pay per query
  • 6. Getting Started – What’s Needed? • Azure subscription • Sign up for ADL preview • Visual Studio 2015 + Azure Data Lake Tools for Visual Studio • Microsoft Azure PowerShell (1.0+ via WPI) • Not to be confused with the version of PowerShell itself, e.g. 5.0. • Microsoft Azure SDK for .NET (Optional)
  • 8. Tools and Navigation • Azure Portal • Visual Studio • Server Explorer
  • 9. Tools and Navigation • Azure Portal • Visual Studio • Server Explorer • Project Templates • PowerShell • SDKs • C# • Node.js
  • 10. Getting data into ADL • Portal
  • 11. Getting data into ADL • Portal • PowerShell • Login-AzureRmAccount • Import-AzureRmDataLakeStoreItem • Connecting to External Data (Demo #2) • SSIS • ADF
  • 12. The Data (NOAA Weather observations): Columns: Station, Datekey, Element, Value, Mflag, Qflag, Sflag, TimeKey. Sample rows (empty fields not shown): US1FLSL0019 20150101 PRCP 173 N; US1TXTV0133 20150101 PRCP 119 N; USC00178998 20150101 TMAX -33 700; USC00178998 20150101 TMIN -167 700; USC00178998 20150101 TOBS -67 700; USC00178998 20150101 PRCP 0 700; USC00178998 20150101 SNOW 0; USC00178998 20150101 SNWD 0; USR0000CSNR 20150101 TMAX 194 H D U. Callout: notice the sparse data with many null values.
  • 13. The Data (NOAA Weather observations): (Same sample table as slide 12.) Callout: multiple observation types, per site, per day.
  • 14. The Data (NOAA Weather observations): (Same sample table as slide 12.) Callout: temperature values are in tenths of a degree C.
  • 15. The Data (NOAA Weather observations): (Same sample table as slide 12.) Callout: correlates to external data uploaded to Azure SQL Database.
  • 16. Basic U-SQL query: Load .csv file from Data Lake Store using built-in Extractor. Schematize using C# data types; note nullability. Output schematized rows to a table variable. SELECT using familiar SQL-like query. Output query result to Data Lake Store using built-in Outputter.
  • 17. Key Takeaways & ‘gotchas’ • SQL keywords MUST be uppercase • Header rows not currently supported by Extractor • e.g. skipFirstNRows:1 not currently supported • Be mindful of nullability in C# types • Built-in extractors and outputters include .Csv(), .Tsv(), & .Text() • Various options such as delimiter • Build custom extractors by implementing IExtractor
  • 18. DEMO #1 • Demo local execution • Simple aggregation of 10,000 rows down to 43, by element type
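The kind of aggregation Demo #1 performs can be pictured outside U-SQL. The following is a minimal Python sketch (not U-SQL, and not the demo script itself) that groups the nine sample rows from the data slides by element type. It assumes a simple row count per element, since the slide does not say which aggregate the demo used; the name `count_by_element` is hypothetical.

```python
from collections import Counter

# Sample rows from the data slides: (station, datekey, element, value).
SAMPLE_ROWS = [
    ("US1FLSL0019", 20150101, "PRCP", 173),
    ("US1TXTV0133", 20150101, "PRCP", 119),
    ("USC00178998", 20150101, "TMAX", -33),
    ("USC00178998", 20150101, "TMIN", -167),
    ("USC00178998", 20150101, "TOBS", -67),
    ("USC00178998", 20150101, "PRCP", 0),
    ("USC00178998", 20150101, "SNOW", 0),
    ("USC00178998", 20150101, "SNWD", 0),
    ("USR0000CSNR", 20150101, "TMAX", 194),
]


def count_by_element(rows):
    """Count rows per element type (the analogue of GROUP BY Element)."""
    return dict(Counter(element for _, _, element, _ in rows))


if __name__ == "__main__":
    for element, n in sorted(count_by_element(SAMPLE_ROWS).items()):
        print(element, n)
```

In the demo the same shape of query reduces 10,000 input rows to 43 output rows, one per element type.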
  • 19. Persisted schema with metadata object model: Familiar CREATE DATABASE statement. Familiar CREATE VIEW statement; the view maintains the extractor and schema definitions, so from now on we can just select from the view. Note that the data is kept in its native compressed (.gz) format; the Extractor handles decompression in this case. Wildcard {*} yields a file set of all matching files.
  • 20. Combining with external data • Create catalog secret using PowerShell (specifies remote host, credentials & ADLA catalog) • New-AzureRmDataLakeAnalyticsCatalogSecret • Create credential (in turn, references catalog secret) • Create data source (in turn, references credential, specifies data source type (e.g. Azure SQL Db) & specifies remote catalog) Callouts: CREATE EXTERNAL TABLE denotes that the underlying table resides remotely; schema using C# types; remote table name.
  • 21. External data with federated query: SELECT FROM EXTERNAL data source EXECUTE. Embedded query executes remotely at data source. This is T-SQL, not U-SQL.
  • 22. External data with federated query: Embedded query executes remotely at data source. This is T-SQL, not U-SQL. Table variable contains only rows returned. Columns: id, name. Rows: US1FLHB0090, TAMPA 10.2 NNW; US1FLHB0048, GREATER NORTHDALE 0.4 ENE; USW00012810, MACDILL AFB; USC00088890, TEMPLE TERRACES; US1FLHB0007, TAMPA 8.4 NW; US1FLHB0025, CARROLLWOOD 1.7 SE; US1FLHB0040, UNIVERSITY WEST 2.0 WNW; USC00088786, TAMPA; US1FLHB0028, WEST PARK 0.4 S; US1FLHB0055, TAMPA 5.0 NNE; US1FLHB0012, CARROLLWOOD 0.5 WNW; US1FLHB0096, TAMPA 5.4 SSW; US1FLHB0005, CITRUS PARK 1.3 ENE; USC00080520, BAY LAKE; US1FLHB0093, TEMPLE TERRACE 1.5 SE; US1FLHB0039, CARROLLWOOD 2.0 SSE; US1FLHB0087, TAMPA 7.9 N; US1FLHB0071, TAMPA 6.1 N; USW00012842, TAMPA INTL AP; US1FLHB0010, TAMPA 5.1 S; US1FLHB0036, TAMPA 4.4 SSW; US1FLHB0029, TAMPA 6.5 NNE; US1FLHB0064, TAMPA 4.7 NW; US1FLHB0051, LUTZ 2.2 SSE.
  • 23. Data relationship exhibit (diagram labels): Azure SQL Db; Federated Query; dbo.station; dbo.calendar; Azure Data Lake Analytics; @station_tpa; dbo.calendar; dbo.observation; Result; Azure Data Lake Store
  • 24. Complex types in U-SQL & EXPLODE • SQL.ARRAY • Like a List or Array in C# • Can be used in conjunction with String.Split() • SQL.MAP • Key-Value pairs • Like a Dictionary (Hash table) in C# • EXPLODE • Expands to rows Example: input columns ID (int), Data (SQL.MAP) with rows 1, ((“A”,25),(“B”,35),(“C”,45)) and 2, ((“A”,27),(“B”,38),(“C”,42)); after EXPLODE the rows are (1, A, 25), (1, B, 35), (1, C, 45), (2, A, 27), (2, B, 38), (2, C, 42).
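To make the EXPLODE example on this slide concrete, here is a small Python sketch of the same row expansion. It is an illustration of the semantics only, not U-SQL; a Python `dict` stands in for SQL.MAP, and the function name `explode` is hypothetical.

```python
# Input rows from the slide: an int ID plus a key-value map per row.
ROWS = [
    (1, {"A": 25, "B": 35, "C": 45}),
    (2, {"A": 27, "B": 38, "C": 42}),
]


def explode(rows):
    """Yield one (id, key, value) row per map entry, keeping the row's ID."""
    for row_id, mapping in rows:
        for key, value in mapping.items():
            yield (row_id, key, value)


if __name__ == "__main__":
    for row in explode(ROWS):
        print(row)
```

Two input rows with three map entries each become six output rows, which is the shape the later temperature-scale query relies on.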
  • 25. Tying it all together with U-SQL: Familiar JOIN syntax; note double equals “==”, the only supported JOIN operator. Familiar WHERE and GROUP BY syntax.
  • 26. Tying it all together with U-SQL: CROSS APPLY the value (recall this was in tenths of degrees C).
  • 27. Tying it all together with U-SQL: CROSS APPLY the value (recall this was in tenths of degrees C). Declare a new SQL.MAP with conversion factors by C, F, and K.
  • 28. Tying it all together with U-SQL: CROSS APPLY the value (recall this was in tenths of degrees C). Declare a new SQL.MAP with conversion factors by C, F, and K. EXPLODE the SQL.MAP into rows and new columns scale, temp.
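The conversion step described on slides 26 to 28 can be sketched in Python as follows. This is an assumption-laden illustration, not the deck's U-SQL: it guesses that the map holds one converted temperature per scale (C, F, K), uses the standard conversion formulas, and the function names are hypothetical.

```python
def temperature_scales(value_tenths_c):
    """Convert a raw value in tenths of a degree C into a scale -> temp map."""
    celsius = value_tenths_c / 10
    return {
        "C": celsius,
        "F": celsius * 9 / 5 + 32,
        "K": celsius + 273.15,
    }


def explode_scales(element, value_tenths_c):
    """Expand one observation into one (element, scale, temp) row per scale."""
    scales = temperature_scales(value_tenths_c)
    return [(element, scale, temp) for scale, temp in scales.items()]


if __name__ == "__main__":
    # TMAX of 194 tenths of a degree C, taken from the sample data slide.
    for row in explode_scales("TMAX", 194):
        print(row)
```

Each temperature observation turns into three rows, one per scale, so a later AVG grouped by element and scale yields the per-scale averages.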
  • 29. Tying it all together with U-SQL: Familiar aggregation AVG using exploded column temp.
  • 30. Tying it all together with U-SQL: Using the String.Concat .NET method to build the description of the derived column, e.g. “AVG_TMAX_F”, “AVG_TMAX_C”, “AVG_TMAX_K”, “AVG_TMIN_F”, “AVG_TMIN_C”, “AVG_TMIN_K”.
  • 31. Tying it all together with U-SQL: CREATE TABLE AS SELECT (CTAS) – conceptually similar to SELECT INTO.
  • 32. Tying it all together with U-SQL: CREATE TABLE AS SELECT (CTAS) – conceptually similar to SELECT INTO. No HEAPs in ADLA; clustered index required.
  • 33. Tying it all together with U-SQL: CREATE TABLE AS SELECT (CTAS) – conceptually similar to SELECT INTO. No HEAPs in ADLA; clustered index required. Partitioned by Round Robin distributes data evenly.
  • 34. Tying it all together with U-SQL: CREATE TABLE AS SELECT (CTAS) – conceptually similar to SELECT INTO. No HEAPs in ADLA; clustered index required. Partitioned by Round Robin distributes data evenly. No update or merge support.
  • 35. DEMO #2 • Data Lake data set consists of daily readings from 98,035 stations over 5 years • ~32,720,048 rows per file • About 164M rows total • 24 Tampa stations • Filtering and aggregating it down to 5 years x 12 months x 2 elements x 3 temperature scales, or 360 rows • Monitor job execution status • Streams, Vertices, Display avg execution time (heat map), Diagnostics, History, Script*
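The row counts quoted for Demo #2 can be checked with a few lines of arithmetic. The Python sketch below assumes one input file per year (the slide gives rows per file and a five-year span but does not state the file count explicitly); all variable names are illustrative.

```python
# Input side: approximate rows per file, assuming one file per year.
rows_per_file = 32_720_048
years = 5
total_input_rows = rows_per_file * years  # 163,600,240, i.e. about 164M

# Output side: one row per year, month, element and temperature scale.
months = 12
elements = 2  # TMAX and TMIN
scales = 3    # C, F and K
result_rows = years * months * elements * scales  # 360

if __name__ == "__main__":
    print(f"input rows:  {total_input_rows:,}")
    print(f"result rows: {result_rows}")
```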
  • 36. Working with Assemblies & Libraries • Code-behind file • Convenient, simple • Assembly created and referenced automatically • No support for NuGet, but manually add references…OR: • Class library • Right-click, register assembly to ADLA • Option to automatically copy to DLS • NuGet supported normally
  • 37. Example: Simple Linear Regression to predict temp: .dlls for statistics library copied to Data Lake Store. CREATE ASSEMBLY from file (can also create from binary). Custom C# class method signature: Noaa.Predict.Regress(int, SqlMap<int, decimal?>) : decimal?
  • 38. Example: Simple Linear Regression to predict temp: .dlls for statistics library copied to Data Lake Store. CREATE ASSEMBLY from file (can also create from binary). Custom C# class method signature: Noaa.Predict.Regress(int, SqlMap<int, decimal?>) : decimal? The MAP_AGG() function returns a SQL.MAP – like a reverse EXPLODE – which we pass as the function parameter.
  • 39. MAP_AGG() Exhibit: Columns: Month, Year, Avg Temp. Rows: (1, 2011, 34); (1, 2012, 36); (1, 2013, 33); (1, 2014, 35); (1, 2015, 37); (2, 2011, 41); (2, 2012, 39); …; (12, 2015, 26)
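The grouping that MAP_AGG() performs on this exhibit can be sketched in Python. This illustrates the semantics only and is not U-SQL; a `dict` stands in for SQL.MAP, the function name `map_agg` is hypothetical, and only the rows actually visible in the exhibit are used.

```python
from collections import defaultdict

# Visible rows from the exhibit: (month, year, avg_temp).
EXHIBIT_ROWS = [
    (1, 2011, 34), (1, 2012, 36), (1, 2013, 33), (1, 2014, 35), (1, 2015, 37),
    (2, 2011, 41), (2, 2012, 39),
]


def map_agg(rows):
    """Collapse rows to one year -> avg_temp map per month (reverse EXPLODE)."""
    grouped = defaultdict(dict)
    for month, year, avg_temp in rows:
        grouped[month][year] = avg_temp
    return dict(grouped)


if __name__ == "__main__":
    for month, series in map_agg(EXHIBIT_ROWS).items():
        print(month, series)
```

Each month ends up with a single map of year to average temperature, which is the series handed to the regression function.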
  • 40. .CS code-behind file: Year to predict, e.g. 2016. Referenced library namespace.
  • 41. .CS code-behind file: Return predicted temp. Year to predict, e.g. 2016. SqlMap contains series against which regression is performed. Referenced library namespace.
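The code-behind itself is not reproduced in this transcript. As a rough stand-in, the Python sketch below fits an ordinary least-squares line to a year-to-temperature series and evaluates it at the year to predict. It is plain Python rather than the C# and statistics library used in the deck; the `regress` name and argument order loosely mirror the Noaa.Predict.Regress(int, SqlMap<int, decimal?>) signature from slide 37, and everything else is an assumption.

```python
def regress(year_to_predict, series):
    """Fit temp = a + b * year by least squares and predict for one year.

    `series` maps year -> average temp; None values (nulls) are skipped.
    Returns None when fewer than two distinct years are available.
    """
    points = [(x, y) for x, y in series.items() if y is not None]
    if len(points) < 2:
        return None
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    # Evaluate the fitted line relative to the means for numeric stability.
    return mean_y + slope * (year_to_predict - mean_x)


if __name__ == "__main__":
    # Month 1 series from the MAP_AGG() exhibit.
    january = {2011: 34, 2012: 36, 2013: 33, 2014: 35, 2015: 37}
    print(regress(2016, january))
```

For the month 1 series in the exhibit the fitted slope is 0.5 degrees per year around a mean of 35 in 2013, so the sketch predicts 36.5 for 2016.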
  • 42. DEMO #3 • Pivoting our existing averages on Month & aggregating Year & Temp into Key-Value pairs (SQL.MAP) which we pass as parameter to custom function • Passing predictive year (2016) as a parameter • Limit our selection to just AVG_TMAX_F • Result adds another 12 rows of predicted temps to our existing 360 row result table
  • 43. Additional homework subjects, not covered • Extractor (UDO) by implementing IExtractor • IOutputter • IProcessor – transform single row, read one, output one • IReducer – read n rows, output 1 row • ICombiner – like a user-defined Join • IApplier – input one row, output n rows • User-defined Aggregators (IAggregate) – AGG keyword • ARRAY_AGG() • Blob as External storage • No Primary Keys • No columnstore (yet) • Table Value Functions - YES, but not with cross apply • No support for R, but leverage .NET libraries as demo’d • User-defined Types • Partitioning by Hash, Direct Hash, Range • http://www.slideshare.net/MichaelRys/usql-partitioned-data-and-tables-sqlbits-2016
  • 44. Reference • Code • https://github.com/SQL-Jason/NOAA_USQL_Demo.git • Data • http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/by_year/ • Blog • http://jasonbrugger.wordpress.com
  • 45. Attribution The Accord.NET Framework Copyright (c) 2009-2014, César Roberto de Souza <[email protected]> This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. The copyright holders provide no reassurances that the source code provided does not infringe any patent, copyright, or any other intellectual property rights of third parties. The copyright holders disclaim any liability to any recipient for claims brought against recipient by any third party for infringement of that parties intellectual property rights. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. National Oceanic and Atmospheric Administration (NOAA) README FILE FOR DAILY GLOBAL HISTORICAL CLIMATOLOGY NETWORK (GHCN-DAILY) Version 3.22 How to cite: Note that the GHCN-Daily dataset itself now has a DOI (Digital Object Identifier) so it may be relevant to cite both the methods/overview journal article as well as the specific version of the dataset used. The journal article describing GHCN-Daily is: Menne, M.J., I. Durre, R.S. Vose, B.E. Gleason, and T.G. Houston, 2012: An overview of the Global Historical Climatology Network-Daily Database. Journal of Atmospheric and Oceanic Technology, 29, 897-910, doi:10.1175/JTECH-D-11-00103.1. To acknowledge the specific version of the dataset used, please cite: Menne, M.J., I. Durre, B. Korzeniewski, S. McNeal, K. Thomas, X. Yin, S. Anthony, R. Ray, R.S. Vose, B.E. Gleason, and T.G. Houston, 2012: Global Historical Climatology Network - Daily (GHCN-Daily), Version 3. [indicate subset used following decimal, e.g. Version 3.12]. NOAA National Climatic Data Center. http://doi.org/10.7289/V5D21VHZ [access date].
  • 46. Bibliography • Campbell, C. “Top Five Differences between Data Lakes and Data Warehouses.” Business Insights. Blue Granite, 26 Jan 2015. Web. https://www.blue-granite.com/blog/bid/402596/Top-Five-Differences-between-Data-Lakes-and-Data-Warehouses • Gopalan, R. (21 Jun 2016). U-SQL Part 4: Use custom code to extend U-SQL [Webinar]. PASS Big Data Virtual Chapter. • Macauley, E. “Overview of Microsoft Azure Data Lake Analytics.” Microsoft Azure. Microsoft, 16 May 2016. Web. https://azure.microsoft.com/en-us/documentation/articles/data-lake-analytics-overview/ • Reddy, S. (31 May 2016). Introduction to Azure Data Lake [Webinar]. PASS Big Data Virtual Chapter. • Rossello, J. “Querying Azure SQL Database from an Azure Data Lake Analytics U-SQL Script.” eat{Code}live. 21 Nov 2015. Web. http://eatcodelive.com/2015/11/21/querying-azure-sql-database-from-an-azure-data-lake-analytics-u-sql-script/ • Rouse, M. “Definition Data Lake.” SearchAws. TechTarget, May 2015. Web. http://searchaws.techtarget.com/definition/data-lake • Rys, M. (8 Mar 2016). Introducing U-SQL; Part 2 of 2: Scaling U-SQL and doing SQL in U-SQL [Webinar]. PASS Big Data Virtual Chapter. Retrieved from http://www.youtube.com/channel/UCkOKmMW_LEsACOqE8C1RWdw • Rys, M. (16 Feb 2016). Introducing U-SQL; Part 1 of 2: Introduction and C# extensibility [Webinar]. PASS Big Data Virtual Chapter. Retrieved from http://www.youtube.com/channel/UCkOKmMW_LEsACOqE8C1RWdw • Rys, M., et al. Azure/usql, (2016), GitHub repository, https://github.com/Azure/usql • “U-SQL Language Reference.” Microsoft Azure. Microsoft, 28 Oct 2015. Web. https://msdn.microsoft.com/en-US/library/azure/mt591959(Azure.100).aspx

Editor's Notes

  • #8, #9, #10: Azure Portal – 1) Browse to create a new resource, 2) Point out the default sample data, 3) Browse my own sample.csv, 4) Point out the Upload option. Visual Studio – 1) Demo Server Explorer login and browse objects, 2) Point out the Upload option.