CloudCamp Chicago
“Big Data and Cloud”
#cloudcamp
@CloudCamp_CHI
Sponsored by
Hosted by
Emcee
Margaret Walker

Cohesive Networks
Tweet: @CloudCamp_Chi

… sponsored by you!
William Knowles - Evident.io
Adam Kallish - IBM
Craig Hancock - HealthEngine
Brandon Pittman - VMware
Chuck Mackie - Maven Wave Partners
Brad Foster - Maven Wave Partners
Kim Neuwirth - Narrative Science
Pia Opulencia - Narrative Science
Jim Stiller - Cloud Technology Partners Networks
Brian Lickenbrock - EY
6:00 pm: Introductions
6:05 pm: Lightning Talks
"Big Data without Big Infrastructure" - Dan Chuparkoff, VP of Product at Civis Analytics @Chuparkoff
"Simplicity, Storytelling and Big Data" - Craig Booth, Data Engineer at Narrative Science @craigmbooth
"Spark: A Quick Ignition" - Matthew Kemp, Team Lead & Engineer of Things at Signal @mattkemp
"Building warehousing systems on Redshift" - Tristan Crockett, Software Engineer at Edgeflip @thcrock
7:00 pm: Unpanel
7:45 pm: Unconference / Networking, drinks and pizza
Agenda
"Big Data without Big
Infrastructure"
Dan Chuparkoff

VP of Product at Civis Analytics
Tweet: @Chuparkoff
@chuparkoff
BIG Data without
BIG Infrastructure
Dan Chuparkoff
VP of Product
Civis Analytics
@chuparkoff Big Data without Big Infrastructure
Civis is an easy-to-use,
incredibly extensible data
science platform in the cloud
for teams who want to make
great data-driven decisions
to drive their organizations
forward.
I work at Civis
“The ability to use the data
that you’ve built up in the past
to inform & improve
what you’re going to do in the future.”
Big Data at Civis Analytics
Data science
is too damn hard
Why can’t I…
have a report every day that says what happened yesterday?
apply predictive modeling to improve my customer retention?
use data from my past to improve acquisition in the future?
Everyone’s story
•  Aggregate
•  Unify
•  Explore
•  Optimize
•  Share
•  Automate
Where should we start?
Cloud vs. On-Prem
Civis Analytics uses AWS
•  No hardware costs and infinitely scalable
•  Safety and security of AWS
•  Automatic backups to multiple data centers
•  Access from any computer with an internet connection
Redshift
S3
EC2
DynamoDB
RDS
EMR
Civis data streams
aggregate data from
virtually any source.
Get all of your data
together in one place.
Aggregate
From data to activation
Next, Civis’ intelligent
matching algorithms
link data in disparate
data stores. No matter
where your data starts,
Civis helps you build a
unified data repository.
Unify
Explore and transform
the data in a fast
analytics database.
Explore
Build powerful
predictive models and
easily score results
with the Civis
platform’s advanced
modeling engine. This is
the heart of data-
driven decision making!
Optimize
Create, automate, &
share reports across
your team.
Empower your entire
organization to move
forward with precision.
Share
When tomorrow comes
there’s no need to
reinvent the wheel. Civis
lets you automate and
schedule from start to
finish, so you can get back
to pushing boundaries.
Automate
Big Data + the Cloud + AWS
helps Civis Analytics turn
an analyst into a data scientist
& a data scientist
into a team of data scientists.
Thanks!
"Simplicity, Storytelling and Big
Data"
Craig Booth
Data Engineer at Narrative Science
Tweet: @craigmbooth
Simplicity, Storytelling & Big Data

Craig Booth
What I Wish I Knew About Big Data On Day One.
My Background
data driven science
30+ journal articles; complex analytics on 10s of TB of data
data powered storytelling
CloudCamp Chicago - Big Data & Cloud May 2015 - All Slides
lumière léger
Credit: Josh Bloom and Henrik Brink of wise.io
“…more than 2000 hours of work in order to come up
with the final combination of 107 algorithms that gave them
this prize”
Xavier Amatriain and Justin Basilico, Netflix
“We evaluated some of the new methods offline but the
additional accuracy gains that we measured did not seem to
justify the engineering effort needed to bring them into a
production environment.”
Xavier Amatriain and Justin Basilico, Netflix
Explainability
Implementability
Accuracy
Can I communicate results?
How long will it take me to build?
Can I tolerate some errors?
"Spark: A Quick Ignition"
Matthew Kemp
Team Lead & Engineer of Things at Signal
Tweet: @mattkemp
Spark: A Quick
Ignition
Matthew Kemp
Provides distributed processing
Main unit of abstraction is the RDD
Can be used with frameworks like Mesos or YARN
Supports Java, Python and Scala
https://spark.apache.org/
What is Spark?
Can be created from…
Files or HDFS
In memory iterable
Cassandra or SQL tables
Transformations
Lazily create a new RDD from an existing one
Actions
Usually return a value and force computation of the RDD
Resilient Distributed Dataset
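The difference between lazy transformations and actions that force computation can be sketched in plain Python, with no Spark installation. `LazyDataset` below is a hypothetical stand-in written for illustration; it is not part of the Spark API and runs on a single machine.

```python
class LazyDataset:
    """Toy, single-machine stand-in for an RDD (hypothetical; not the Spark API)."""

    def __init__(self, source, steps=()):
        self._source = source   # any re-iterable input, e.g. a list
        self._steps = steps     # recorded transformations, not yet run

    # Transformations: return a new dataset and do no work yet.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self._steps + (("filter", fn),))

    def _run(self):
        # Chain the recorded steps over the source, one element at a time.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: force the computation and return a value.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []

def traced_square(x):
    calls.append(x)   # record when the function actually runs
    return x * x

squares = LazyDataset([1, 2, 3, 4]).map(traced_square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)   # nothing has been computed at this point
result = squares.collect()         # the action triggers the work
```

Building `squares` only records the two steps; `traced_square` is first called when `collect()` runs.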
Some examples:
filter
map
flatMap
distinct
union
intersection
join
reduceByKey
Transformations
Some examples:
reduce
collect
take
count
foreach
saveAsTextFile
Actions
Sample Text
Spark Example
Spark Shell
Shell Example
Gists
Example: Word Count
input → map() → flatMap() → reduceByKey() → map() → output
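#!/usr/bin/env python
import re
import string

# Matches any single punctuation character.
regex = re.compile('[%s]' % re.escape(string.punctuation))

def word_count(sc, in_file_name, out_file_name):
    # Clean each line, emit (word, 1) pairs, sum per word, write "word,count" lines.
    (sc.textFile(in_file_name)
        .map(lambda line: regex.sub(' ', line).strip().lower())
        .flatMap(lambda line: [(word, 1) for word in line.split()])
        .reduceByKey(lambda a, b: a + b)
        # Index the pair instead of unpacking it in the lambda, which Python 3 rejects.
        .map(lambda kv: '%s,%s' % (kv[0], kv[1]))
        .saveAsTextFile(out_file_name))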
Example: Word Count
#!/usr/bin/env python
import re
import string

# Matches any single punctuation character.
regex = re.compile('[%s]' % re.escape(string.punctuation))

def word_count(sc, in_file_name, out_file_name):
    # One small transformation per step.
    (sc.textFile(in_file_name)
        .map(lambda line: regex.sub(' ', line))
        .map(lambda line: line.strip())
        .map(lambda line: line.lower())
        .flatMap(lambda line: line.split())
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
        # Index the pair instead of unpacking it in the lambda, which Python 3 rejects.
        .map(lambda kv: '%s,%s' % (kv[0], kv[1]))
        .saveAsTextFile(out_file_name))
Example: Alternate Word Count
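For checking the pipeline's output on a small file without a cluster, the same steps can be mirrored on one machine. This is a minimal sketch, not part of the talk: `local_word_count` and the sample lines are hypothetical, and the result is sorted here only to make it easy to read.

```python
import re
import string
from collections import Counter

PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))

def local_word_count(lines):
    """Count words with the same cleaning rules as the Spark pipeline."""
    counts = Counter()
    for line in lines:
        # Mirrors the map() steps: strip punctuation, trim, lowercase.
        cleaned = PUNCTUATION.sub(' ', line).strip().lower()
        # Mirrors flatMap() plus reduceByKey(): split into words and tally them.
        counts.update(cleaned.split())
    # Mirrors the final map(): one "word,count" string per word.
    return ['%s,%s' % (word, n) for word, n in sorted(counts.items())]

sample = ["A knight, a squire.", "The knight's ALARM!"]
formatted = local_word_count(sample)
```

Note that replacing punctuation with a space splits "knight's" into "knight" and "s", exactly as the Spark version would.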
$ pyspark
...
Using Python version 2.7.2 (default)
SparkContext available as sc.
>>> from word_count import word_count
>>> word_count(sc, 'text.txt', 'text_counts')
Running the Example
a,23
able,1
about,6
above,1
accept,1
accuse,1
ago,2
alarm,2
all,7
although,1
always,2
an,1
The Results From Spark
and,26
anger,1
another,1
any,2
anyone,1
arches,1
are,1
arm,1
armour,1
as,7
assistant,2
...
#!/bin/bash
text=$(cat ${1} | tr "[:punct:]" " " | 
tr "[:upper:]" "[:lower:]")
parsed=(${text})
for w in ${parsed[@]}; do echo ${w}; done | sort | uniq -c
A (Bad) Shell Version
23 a
1 able
6 about
1 above
1 accept
1 accuse
2 ago
2 alarm
7 all
1 although
2 always
1 an
The Results From the Shell
26 and
1 anger
1 another
2 any
1 anyone
1 arches
1 are
1 arm
1 armour
7 as
2 assistant
...
Our Use Case
[Diagram: 1st party and 3rd party data sets combined through distinct(), join(), union(), distinct() and foreach()]
Questions?
Contact Info
mkemp@signal.co
@mattkemp
/in/matthewkemp
"Building warehousing systems on
Redshift"
Tristan Crockett

Software Engineer at Edgeflip
Tweet: @thcrock
Redshift: Lessons Learned
Tristan Crockett – Software Engineer, Edgeflip
Basics
● Analytical database
● PostgreSQL with column storage engine
● Automatic data compression
● No traditional indexes; specify a sort key (how are records in the table sorted?) and a distribution key (which node will house a record?)
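The two questions in the last bullet can be illustrated with a conceptual sketch. Everything here is an assumption for illustration: the CRC32-modulo hash, the function names and the sample rows are hypothetical, and Redshift's real placement and storage internals are not described in the talk.

```python
import zlib

def node_for(dist_key_value, num_nodes):
    """Pick a node from the distribution key value (illustrative hash only)."""
    digest = zlib.crc32(str(dist_key_value).encode("utf-8"))
    return digest % num_nodes

def place_rows(rows, dist_key, sort_key, num_nodes):
    """Group rows by node, then keep each node's rows ordered by the sort key."""
    nodes = {n: [] for n in range(num_nodes)}
    for row in rows:
        nodes[node_for(row[dist_key], num_nodes)].append(row)
    for stored in nodes.values():
        stored.sort(key=lambda row: row[sort_key])
    return nodes

rows = [
    {"user_id": 7, "ts": 30},
    {"user_id": 3, "ts": 10},
    {"user_id": 7, "ts": 20},
    {"user_id": 3, "ts": 40},
]
layout = place_rows(rows, dist_key="user_id", sort_key="ts", num_nodes=4)
```

Rows that share a distribution key value always land on the same node, which is why the choice of key affects how much data a join has to move.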
My Work with Redshift
● Data warehouse for Facebook user feeds and related app data
● Inputs
– RDS (MySQL)
– DynamoDB
– Facebook
● Stats
– ~2 TB of compressed data
– Two main tables, ~5 billion and ~25 billion rows respectively
Advantages / Disadvantages
● Fast at copying data in from S3
● Fast at computing aggregate/analytical functions over an entire table
● Decent at intra-db operations (create table as select, insert into select)
● Most everything else is slow
● Without traditional indexes, table design isn't as flexible
Lessons/Tips
● Optimize load size (1 MB to 1 GB per file)
● Compress input
● Upsert when needed, and always vacuum
● Don't populate tables with 'CREATE TABLE AS' if you like compression (which you do)
● To avoid complicated joins, consider computing single-table aggregates and joining on the results
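The first two tips (sized load files, compressed input) can be sketched as a small pre-processing step before upload. This is an assumption-laden illustration rather than the speaker's tooling: `write_gzip_chunks` is a hypothetical helper, and it measures uncompressed bytes, so the threshold would need tuning to hit a given on-disk size.

```python
import gzip
import os
import tempfile

def write_gzip_chunks(lines, out_dir, prefix, max_bytes):
    """Write text lines into numbered .gz files of about max_bytes uncompressed each."""
    paths, out, written = [], None, 0
    for line in lines:
        data = (line.rstrip("\n") + "\n").encode("utf-8")
        # Start a new file when none is open or the current one would overflow.
        if out is None or (written and written + len(data) > max_bytes):
            if out is not None:
                out.close()
            path = os.path.join(out_dir, "%s.%04d.gz" % (prefix, len(paths)))
            out, written = gzip.open(path, "wb"), 0
            paths.append(path)
        out.write(data)
        written += len(data)
    if out is not None:
        out.close()
    return paths

# Tiny demonstration; a real load would use a max_bytes in the MB-to-GB range.
demo_dir = tempfile.mkdtemp()
demo_lines = ["row-%03d,value" % i for i in range(100)]
chunk_paths = write_gzip_chunks(demo_lines, demo_dir, "feed", max_bytes=500)
```

Each 14-byte row fits 35 to a 500-byte chunk, so the 100 demo rows end up in three files.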
Upsert
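The code from this slide did not survive extraction. One commonly used way to upsert in Redshift is to load new rows into a staging table, delete the matching rows from the target, and insert everything from staging. The sketch below only builds the SQL text; the helper name and the `posts` / `posts_staging` tables are hypothetical, and this may differ from what the speaker showed.

```python
def upsert_statements(target, staging, key_columns):
    """Build the SQL for a staging-table upsert (names are caller-supplied)."""
    match = " AND ".join(
        "%s.%s = %s.%s" % (target, col, staging, col) for col in key_columns
    )
    return [
        "BEGIN;",
        # Remove target rows that are about to be replaced.
        "DELETE FROM %s USING %s WHERE %s;" % (target, staging, match),
        # Add every staged row, both updated and brand new.
        "INSERT INTO %s SELECT * FROM %s;" % (target, staging),
        "COMMIT;",
        # Deleted rows keep occupying space until the table is vacuumed.
        "VACUUM %s;" % target,
    ]

statements = upsert_statements("posts", "posts_staging", ["user_id", "post_id"])
```

The delete and insert sit inside one transaction so readers never see the target with rows missing; the vacuum runs afterwards because it cannot run inside a transaction block.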
Keep an Eye on Compression
Single-Table Aggregates
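The idea of aggregating each table on its own and then joining the small results can be shown with an in-memory SQLite database standing in for Redshift. The `posts` and `likes` tables and their contents are hypothetical; only the query shape is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (user_id INTEGER, post_id INTEGER);
    CREATE TABLE likes (user_id INTEGER, like_id INTEGER);
    INSERT INTO posts VALUES (1, 10), (1, 11), (2, 12);
    INSERT INTO likes VALUES (1, 100), (2, 101), (2, 102), (2, 103);
""")

# Aggregate each table separately first, then join the two small results.
query = """
    SELECT p.user_id, p.post_count, l.like_count
    FROM (SELECT user_id, COUNT(*) AS post_count FROM posts GROUP BY user_id) AS p
    JOIN (SELECT user_id, COUNT(*) AS like_count FROM likes GROUP BY user_id) AS l
      ON p.user_id = l.user_id
    ORDER BY p.user_id
"""
per_user = conn.execute(query).fetchall()
conn.close()
```

Joining the raw tables first would multiply rows per user before counting; joining the per-user aggregates keeps the join at one row per user on each side.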
Thanks for Listening!
tristan.h.crockett@gmail.com
@thcrock
Un-panel Discussion
volunteer to join the panel & ask
questions from the floor!
Unconference
Small groups & discussions, network
Pizza’s almost here!
Ad

More Related Content

What's hot (16)

Containerizing the Cloud with Kubernetes and Docker
Containerizing the Cloud with Kubernetes and DockerContainerizing the Cloud with Kubernetes and Docker
Containerizing the Cloud with Kubernetes and Docker
James Chittenden
 
Special Purpose Quantum Annealing Quantum Computer v1.0
Special Purpose Quantum Annealing Quantum Computer v1.0Special Purpose Quantum Annealing Quantum Computer v1.0
Special Purpose Quantum Annealing Quantum Computer v1.0
Aditya Yadav
 
Streaming Analytics for Financial Enterprises
Streaming Analytics for Financial EnterprisesStreaming Analytics for Financial Enterprises
Streaming Analytics for Financial Enterprises
Databricks
 
Microservices meetup April 2017
Microservices meetup April 2017Microservices meetup April 2017
Microservices meetup April 2017
SignalFx
 
The Netflix data platform: Now and in the future by Kurt Brown
The Netflix data platform: Now and in the future by Kurt BrownThe Netflix data platform: Now and in the future by Kurt Brown
The Netflix data platform: Now and in the future by Kurt Brown
Data Con LA
 
Genome-scale Big Data Pipelines
Genome-scale Big Data PipelinesGenome-scale Big Data Pipelines
Genome-scale Big Data Pipelines
Lynn Langit
 
Security From The Big Data and Analytics Perspective
Security From The Big Data and Analytics PerspectiveSecurity From The Big Data and Analytics Perspective
Security From The Big Data and Analytics Perspective
All Things Open
 
Cloud Native Cost Optimization
Cloud Native Cost OptimizationCloud Native Cost Optimization
Cloud Native Cost Optimization
Adrian Cockcroft
 
Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...
Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...
Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...
Aditya Yadav
 
Capacity Planning with Free Tools
Capacity Planning with Free ToolsCapacity Planning with Free Tools
Capacity Planning with Free Tools
Adrian Cockcroft
 
The Future of Computing is Distributed
The Future of Computing is DistributedThe Future of Computing is Distributed
The Future of Computing is Distributed
Alluxio, Inc.
 
Elephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud readyElephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud ready
Krzysztof Adamski
 
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Chris Jang
 
Got a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP HelpsGot a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP Helps
RightScale
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUs
iguazio
 
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Cohesive Networks
 
Containerizing the Cloud with Kubernetes and Docker
Containerizing the Cloud with Kubernetes and DockerContainerizing the Cloud with Kubernetes and Docker
Containerizing the Cloud with Kubernetes and Docker
James Chittenden
 
Special Purpose Quantum Annealing Quantum Computer v1.0
Special Purpose Quantum Annealing Quantum Computer v1.0Special Purpose Quantum Annealing Quantum Computer v1.0
Special Purpose Quantum Annealing Quantum Computer v1.0
Aditya Yadav
 
Streaming Analytics for Financial Enterprises
Streaming Analytics for Financial EnterprisesStreaming Analytics for Financial Enterprises
Streaming Analytics for Financial Enterprises
Databricks
 
Microservices meetup April 2017
Microservices meetup April 2017Microservices meetup April 2017
Microservices meetup April 2017
SignalFx
 
The Netflix data platform: Now and in the future by Kurt Brown
The Netflix data platform: Now and in the future by Kurt BrownThe Netflix data platform: Now and in the future by Kurt Brown
The Netflix data platform: Now and in the future by Kurt Brown
Data Con LA
 
Genome-scale Big Data Pipelines
Genome-scale Big Data PipelinesGenome-scale Big Data Pipelines
Genome-scale Big Data Pipelines
Lynn Langit
 
Security From The Big Data and Analytics Perspective
Security From The Big Data and Analytics PerspectiveSecurity From The Big Data and Analytics Perspective
Security From The Big Data and Analytics Perspective
All Things Open
 
Cloud Native Cost Optimization
Cloud Native Cost OptimizationCloud Native Cost Optimization
Cloud Native Cost Optimization
Adrian Cockcroft
 
Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...
Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...
Automatski - RSA-2048 Cryptography Cracked using Shor's Algorithm on a Quantu...
Aditya Yadav
 
Capacity Planning with Free Tools
Capacity Planning with Free ToolsCapacity Planning with Free Tools
Capacity Planning with Free Tools
Adrian Cockcroft
 
The Future of Computing is Distributed
The Future of Computing is DistributedThe Future of Computing is Distributed
The Future of Computing is Distributed
Alluxio, Inc.
 
Elephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud readyElephants in the cloud or how to become cloud ready
Elephants in the cloud or how to become cloud ready
Krzysztof Adamski
 
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Google Cloud Platform & rockPlace Big Data Event-Mar.31.2016
Chris Jang
 
Got a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP HelpsGot a Multi-Cloud Strategy? How RightScale CMP Helps
Got a Multi-Cloud Strategy? How RightScale CMP Helps
RightScale
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUs
iguazio
 
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Cohesive Networks
 

Viewers also liked (15)

Native container monitoring
Native container monitoringNative container monitoring
Native container monitoring
Rohit Jnagal
 
Skynet project: Monitor, analyze, scale, and maintain a system in the Cloud
Skynet project: Monitor, analyze, scale, and maintain a system in the CloudSkynet project: Monitor, analyze, scale, and maintain a system in the Cloud
Skynet project: Monitor, analyze, scale, and maintain a system in the Cloud
Sylvain Kalache
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
Datadog
 
Lifting the Blinds: Monitoring Windows Server 2012
Lifting the Blinds: Monitoring Windows Server 2012Lifting the Blinds: Monitoring Windows Server 2012
Lifting the Blinds: Monitoring Windows Server 2012
Datadog
 
Data Logging and Telemetry
Data Logging and TelemetryData Logging and Telemetry
Data Logging and Telemetry
Francesco Meschia
 
Deep-Dive to Application Insights
Deep-Dive to Application Insights Deep-Dive to Application Insights
Deep-Dive to Application Insights
Gunnar Peipman
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
Matthew Broberg
 
Sysdig Monitorama Slides
Sysdig Monitorama SlidesSysdig Monitorama Slides
Sysdig Monitorama Slides
Loris Degioanni
 
Volta: Logging, Metrics, and Monitoring as a Service
Volta: Logging, Metrics, and Monitoring as a ServiceVolta: Logging, Metrics, and Monitoring as a Service
Volta: Logging, Metrics, and Monitoring as a Service
LN Renganarayana
 
Netflix: From Clouds to Roots
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to Roots
Brendan Gregg
 
AWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at Netflix
Adrian Cockcroft
 
Container Orchestration Wars
Container Orchestration WarsContainer Orchestration Wars
Container Orchestration Wars
Karl Isenberg
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
Docker, Inc.
 
Persistent storage tailored for containers
Persistent storage tailored for containersPersistent storage tailored for containers
Persistent storage tailored for containers
Docker, Inc.
 
The New Stack Container Summit Talk
The New Stack Container Summit TalkThe New Stack Container Summit Talk
The New Stack Container Summit Talk
The New Stack
 
Native container monitoring
Native container monitoringNative container monitoring
Native container monitoring
Rohit Jnagal
 
Skynet project: Monitor, analyze, scale, and maintain a system in the Cloud
Skynet project: Monitor, analyze, scale, and maintain a system in the CloudSkynet project: Monitor, analyze, scale, and maintain a system in the Cloud
Skynet project: Monitor, analyze, scale, and maintain a system in the Cloud
Sylvain Kalache
 
Monitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloudMonitoring kubernetes across data center and cloud
Monitoring kubernetes across data center and cloud
Datadog
 
Lifting the Blinds: Monitoring Windows Server 2012
Lifting the Blinds: Monitoring Windows Server 2012Lifting the Blinds: Monitoring Windows Server 2012
Lifting the Blinds: Monitoring Windows Server 2012
Datadog
 
Deep-Dive to Application Insights
Deep-Dive to Application Insights Deep-Dive to Application Insights
Deep-Dive to Application Insights
Gunnar Peipman
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
Matthew Broberg
 
Sysdig Monitorama Slides
Sysdig Monitorama SlidesSysdig Monitorama Slides
Sysdig Monitorama Slides
Loris Degioanni
 
Volta: Logging, Metrics, and Monitoring as a Service
Volta: Logging, Metrics, and Monitoring as a ServiceVolta: Logging, Metrics, and Monitoring as a Service
Volta: Logging, Metrics, and Monitoring as a Service
LN Renganarayana
 
Netflix: From Clouds to Roots
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to Roots
Brendan Gregg
 
AWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at Netflix
Adrian Cockcroft
 
Container Orchestration Wars
Container Orchestration WarsContainer Orchestration Wars
Container Orchestration Wars
Karl Isenberg
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
Docker, Inc.
 
Persistent storage tailored for containers
Persistent storage tailored for containersPersistent storage tailored for containers
Persistent storage tailored for containers
Docker, Inc.
 
The New Stack Container Summit Talk
The New Stack Container Summit TalkThe New Stack Container Summit Talk
The New Stack Container Summit Talk
The New Stack
 
Ad

Similar to CloudCamp Chicago - Big Data & Cloud May 2015 - All Slides (20)

CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)
CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)
CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)
CloudCamp Chicago
 
300k messages/min in an IoT serverless system
300k messages/min in an IoT serverless system300k messages/min in an IoT serverless system
300k messages/min in an IoT serverless system
Alex Pshul
 
Introduction and Overview of OpenStack for IaaS
Introduction and Overview of OpenStack for IaaSIntroduction and Overview of OpenStack for IaaS
Introduction and Overview of OpenStack for IaaS
Keith Basil
 
Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...
Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...
Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...
Keith Kraus
 
Building Data Intensity with AWS MSK & Lenses.io
Building Data Intensity with AWS MSK & Lenses.ioBuilding Data Intensity with AWS MSK & Lenses.io
Building Data Intensity with AWS MSK & Lenses.io
Lenses.io
 
A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...
A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...
A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...
Databricks
 
Microsoft Dryad
Microsoft DryadMicrosoft Dryad
Microsoft Dryad
Colin Clark
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
confluent
 
Cloud Computing ...changes everything
Cloud Computing ...changes everythingCloud Computing ...changes everything
Cloud Computing ...changes everything
Lew Tucker
 
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
Lightbend
 
2014 DATA @ NFLX (Tableau Customer Conference)
2014 DATA @ NFLX (Tableau Customer Conference)2014 DATA @ NFLX (Tableau Customer Conference)
2014 DATA @ NFLX (Tableau Customer Conference)
Albert Wong
 
Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?
Guido Schmutz
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
ITCamp
 
Using the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchUsing the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science Research
Robert Grossman
 
Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...
Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...
Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...
CodeValue
 
Scalable Open-Source IoT Solutions on Microsoft Azure
Scalable Open-Source IoT Solutions on Microsoft AzureScalable Open-Source IoT Solutions on Microsoft Azure
Scalable Open-Source IoT Solutions on Microsoft Azure
Maxim Ivannikov
 
Cytoscape CI Chapter 2
Cytoscape CI Chapter 2Cytoscape CI Chapter 2
Cytoscape CI Chapter 2
bdemchak
 
2017 12 lab informatics summit
2017 12 lab informatics summit2017 12 lab informatics summit
2017 12 lab informatics summit
Chris Dwan
 
cloud computing
cloud computingcloud computing
cloud computing
Krishna Kumar
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
SoftServe
 
CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)
CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)
CloudCamp Chicago Jan 2015 - The Guts of the Cloud (full slides)
CloudCamp Chicago
 
300k messages/min in an IoT serverless system
300k messages/min in an IoT serverless system300k messages/min in an IoT serverless system
300k messages/min in an IoT serverless system
Alex Pshul
 
Introduction and Overview of OpenStack for IaaS
Introduction and Overview of OpenStack for IaaSIntroduction and Overview of OpenStack for IaaS
Introduction and Overview of OpenStack for IaaS
Keith Basil
 
Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...
Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...
Streaming Cyber Security into Graph: Accelerating Data into DataStax Graph an...
Keith Kraus
 
Building Data Intensity with AWS MSK & Lenses.io
Building Data Intensity with AWS MSK & Lenses.ioBuilding Data Intensity with AWS MSK & Lenses.io
Building Data Intensity with AWS MSK & Lenses.io
Lenses.io
 
A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...
A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...
A Journey to Building an Autonomous Streaming Data Platform—Scaling to Trilli...
Databricks
 
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
Scaling Security on 100s of Millions of Mobile Devices Using Apache Kafka® an...
confluent
 
Cloud Computing ...changes everything
Cloud Computing ...changes everythingCloud Computing ...changes everything
Cloud Computing ...changes everything
Lew Tucker
 
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
How to build streaming data pipelines with Akka Streams, Flink, and Spark usi...
Lightbend
 
2014 DATA @ NFLX (Tableau Customer Conference)
2014 DATA @ NFLX (Tableau Customer Conference)2014 DATA @ NFLX (Tableau Customer Conference)
2014 DATA @ NFLX (Tableau Customer Conference)
Albert Wong
 
Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?
Guido Schmutz
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
ITCamp
 
Using the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchUsing the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science Research
Robert Grossman
 
Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...
Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...
Alex Pshul: What We Learned by Testing Execution of 300K Messages/Min in a Se...
CodeValue
 
Scalable Open-Source IoT Solutions on Microsoft Azure
Scalable Open-Source IoT Solutions on Microsoft AzureScalable Open-Source IoT Solutions on Microsoft Azure
Scalable Open-Source IoT Solutions on Microsoft Azure
Maxim Ivannikov
 
Cytoscape CI Chapter 2
Cytoscape CI Chapter 2Cytoscape CI Chapter 2
Cytoscape CI Chapter 2
bdemchak
 
2017 12 lab informatics summit
2017 12 lab informatics summit2017 12 lab informatics summit
2017 12 lab informatics summit
Chris Dwan
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
SoftServe
 
Ad

CloudCamp Chicago - Big Data & Cloud May 2015 - All Slides

  • 1. CloudCamp Chicago “Big Data and Cloud” #cloudcamp @CloudCamp_CHI Sponsored by Hosted by
  • 2. Emcee: Margaret Walker, Cohesive Networks. Tweet: @CloudCamp_Chi
  • 3. … sponsored by you!
      William Knowles - Evident.io
      Adam Kallish - IBM
      Craig Hancock - HealthEngine
      Brandon Pittman - VMware
      Chuck Mackie - Maven Wave Partners
      Brad Foster - Maven Wave Partners
      Kim Neuwirth - Narrative Science
      Pia Opulencia - Narrative Science
      Jim Stiller - CloudTechnology Partners Networks
      Brian Lickenbrock - EY
  • 4. Agenda
      6:00 pm: Introductions
      6:05 pm: Lightning Talks
        "Big Data without Big Infrastructure" - Dan Chuparkoff, VP of Product at Civis Analytics @Chuparkoff
        "Simplicity, Storytelling and Big Data" - Craig Booth, Data Engineer at Narrative Science @craigmbooth
        "Spark: A Quick Ignition" - Matthew Kemp, Team Lead & Engineer of Things at Signal @mattkemp
        "Building warehousing systems on Redshift" - Tristan Crockett, Software Engineer at Edgeflip @thcrock
      7:00 pm: Unpanel
      7:45 pm: Unconference / Networking, drinks and pizza
  • 5. "Big Data without Big Infrastructure" Dan Chuparkoff, VP of Product at Civis Analytics. Tweet: @Chuparkoff
  • 6. @chuparkoff BIG Data without BIG Infrastructure Dan Chuparkoff VP of Product Civis Analytics
  • 7. I work at Civis: Civis is an easy-to-use, incredibly extensible data science platform in the cloud for teams who want to make great data-driven decisions to drive their organizations forward.
  • 8. Big Data at Civis Analytics: “The ability to use the data that you’ve built up in the past to inform & improve what you’re going to do in the future.”
  • 9. Data science is too damn hard. Why can’t I… have a report every day that says what happened yesterday? …apply predictive modeling to improve my customer retention? …use data from my past to improve acquisition in the future?
  • 10. Everyone’s story: Aggregate, Unify, Explore, Optimize, Share, Automate
  • 11. Where should we start? Cloud vs. OnPrem
  • 12. Civis Analytics uses AWS
  • 13. No hardware costs and infinitely scalable; safety and security of AWS; automatic backups to multiple data centers; access from any computer with an internet connection
  • 14. Redshift, S3, EC2, DynamoDB, RDS, EMR
  • 18. From data to activation. Aggregate: Civis data streams aggregate data from virtually any source. Get all of your data together in one place.
  • 19. Unify: Next, Civis’ intelligent matching algorithms link data in disparate data stores. No matter where your data starts, Civis helps you build a unified data repository.
  • 20. Explore: Explore and transform the data in a fast analytics database.
  • 21. Optimize: Build powerful predictive models and easily score results with the Civis platform’s advanced modeling engine. This is the heart of data-driven decision making!
  • 22. Share: Create, automate, & share reports across your team. Empower your entire organization to move forward with precision.
  • 23. Automate: When tomorrow comes there’s no need to reinvent the wheel. Civis lets you automate and schedule from start to finish, so you can get back to pushing boundaries.
  • 24. Big Data + the Cloud + AWS helps Civis Analytics turn an analyst into a data scientist & a data scientist into a team of data scientists. Thanks!
  • 25. "Simplicity, Storytelling and Big Data" Craig Booth, Data Engineer at Narrative Science. Tweet: @craigmbooth
  • 26. Simplicity, Storytelling & Big Data: What I Wish I Knew About Big Data On Day One. Craig Booth
  • 27. My Background: data-driven science (30+ journal articles; complex analytics on 10s of TB of data); data-powered storytelling
  • 32. Credit: Josh Bloom and Henrik Brink of wise.io
  • 33. “…more than 2000 hours of work in order to come up with the final combination of 107 algorithms that gave them this prize” Xavier Amatriain and Justin Basilico, Netflix
  • 34. “We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment.” Xavier Amatriain and Justin Basilico, Netflix. Explainability (Can I communicate results?); Implementability (How long will it take me to build?); Accuracy (Can I tolerate some errors?)
  • 35. "Spark: A Quick Ignition" Matthew Kemp, Team Lead & Engineer of Things at Signal. Tweet: @mattkemp
  • 37. What is Spark?
      Provides distributed processing
      Main unit of abstraction is the RDD
      Can be used with frameworks like Mesos or Yarn
      Supports Java, Python and Scala
      https://spark.apache.org/
  • 38. Resilient Distributed Dataset
      Can be created from…
        Files or HDFS
        In memory iterable
        Cassandra or SQL tables
      Transformations: lazily create a new RDD from an existing one
      Actions: usually return a value, force computation of RDD
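The split between lazy transformations and computation-forcing actions can be sketched without a cluster. The class below is a toy, pure-Python stand-in written for illustration; `LazyDataset` and `noisy_square` are hypothetical names, and none of this is Spark's API or implementation. It only shows the idea that `map` and `filter` record work while `collect` and `count` perform it.

```python
class LazyDataset:
    """Toy stand-in for an RDD: records a chain of steps, runs them only on demand."""

    def __init__(self, source, steps=()):
        self._source = source        # any re-iterable, e.g. a list
        self._steps = tuple(steps)   # pending transformations, not yet applied

    # Transformations: return a new LazyDataset and do no work yet.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self._steps + (("filter", fn),))

    def _iterate(self):
        # Build the iterator chain; nothing runs until it is consumed.
        items = iter(self._source)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: force the computation and return a value.
    def collect(self):
        return list(self._iterate())

    def count(self):
        return sum(1 for _ in self._iterate())


calls = []

def noisy_square(x):
    calls.append(x)  # record when the function actually runs
    return x * x

squares = LazyDataset([1, 2, 3, 4]).map(noisy_square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)  # nothing has been computed at this point
result = squares.collect()        # the action triggers the work
```

Each transformation returns a new object instead of mutating the old one, which mirrors the slide's wording "create a new RDD from an existing one".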
  • 41. Gists: Sample Text; Spark Example; Spark Shell; Shell Example
  • 43. Example: Word Count
      #!/bin/python
      import re
      import string

      # Matches any single punctuation character.
      regex = re.compile('[%s]' % re.escape(string.punctuation))

      def word_count(sc, in_file_name, out_file_name):
          # Parentheses let the chained calls span several lines.
          (sc.textFile(in_file_name)
             .map(lambda line: regex.sub(' ', line).strip().lower())
             .flatMap(lambda line: [(word, 1) for word in line.split()])
             .reduceByKey(lambda a, b: a + b)
             # Format the (word, count) pair directly; tuple-unpacking lambdas are Python 2 only.
             .map(lambda pair: '%s,%s' % pair)
             .saveAsTextFile(out_file_name))
  • 44. Example: Alternate Word Count
      #!/bin/python
      import re
      import string

      # Matches any single punctuation character.
      regex = re.compile('[%s]' % re.escape(string.punctuation))

      def word_count(sc, in_file_name, out_file_name):
          # Same pipeline as before, with one small step per transformation.
          (sc.textFile(in_file_name)
             .map(lambda line: regex.sub(' ', line))
             .map(lambda line: line.strip())
             .map(lambda line: line.lower())
             .flatMap(lambda line: line.split())
             .map(lambda word: (word, 1))
             .reduceByKey(lambda a, b: a + b)
             # Format the (word, count) pair directly; tuple-unpacking lambdas are Python 2 only.
             .map(lambda pair: '%s,%s' % pair)
             .saveAsTextFile(out_file_name))
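For checking small inputs without a Spark installation, the same word-count steps can be sketched in plain Python. This is an illustrative assumption, not part of the talk: `word_count_lines` and the sample sentences are made up here, and the sorting is added only to make the output order stable.

```python
import re
import string
from collections import Counter

# Same punctuation pattern as the Spark example.
PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))

def word_count_lines(lines):
    """Count words across lines and return 'word,count' rows sorted by word."""
    counts = Counter()
    for line in lines:
        # Replace punctuation with spaces, trim, lowercase: the first map step.
        cleaned = PUNCTUATION.sub(' ', line).strip().lower()
        # Split into words and add them up: flatMap plus reduceByKey in one go.
        counts.update(cleaned.split())
    return ['%s,%s' % (word, count) for word, count in sorted(counts.items())]

sample = ["A knight, a squire.", "The knight's armour!"]
rows = word_count_lines(sample)
```

Because the apostrophe is treated as punctuation, "knight's" splits into "knight" and "s", which is also what the regex in the Spark version would do.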
  • 45. Running the Example
      $ pyspark
      ...
      Using Python version 2.7.2 (default)
      SparkContext available as sc.
      >>> from word_count import word_count
      >>> word_count(sc, 'text.txt', 'text_counts')
  • 46. The Results From Spark: a,23 able,1 about,6 above,1 accept,1 accuse,1 ago,2 alarm,2 all,7 although,1 always,2 an,1 and,26 anger,1 another,1 any,2 anyone,1 arches,1 are,1 arm,1 armour,1 as,7 assistant,2 ...
  • 47. A (Bad) Shell Version
      #!/bin/bash
      text=$(cat ${1} | tr "[:punct:]" " " | tr "[:upper:]" "[:lower:]")
      parsed=(${text})
      for w in ${parsed[@]}; do echo ${w}; done | sort | uniq -c
  • 48. The Results From the Shell: 23 a, 1 able, 6 about, 1 above, 1 accept, 1 accuse, 2 ago, 2 alarm, 7 all, 1 although, 2 always, 1 an, 26 and, 1 anger, 1 another, 2 any, 1 anyone, 1 arches, 1 are, 1 arm, 1 armour, 7 as, 2 assistant, ...
  • 49. Our Use Case (diagram; inputs: one 1st-party and two 3rd-party sources; operations shown: distinct(), join(), union(), foreach())
  • 52. "Building warehousing systems on Redshift" Tristan Crockett, Software Engineer at Edgeflip. Tweet: @thcrock
  • 53. Redshift: Lessons Learned Tristan Crockett – Software Engineer, Edgeflip
  • 54. Basics
      ● Analytical database
      ● PostgreSQL with column storage engine
      ● Automatic data compression
      ● No traditional indexes; specify a sort key (how are records in the table sorted?) and distribution key (which node will house a record?)
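The two questions on this slide (which node houses a record, and how records are sorted) can be illustrated with a toy placement function. This is a hedged sketch only: `place_rows`, the CRC32 hash, and the sample columns `user_id` and `ts` are assumptions made for illustration and say nothing about how Redshift actually hashes or stores rows.

```python
import zlib
from collections import defaultdict

def place_rows(rows, dist_key, sort_key, node_count):
    """Assign each row to a node by hashing dist_key, then sort each node by sort_key."""
    nodes = defaultdict(list)
    for row in rows:
        # Toy hash, chosen because it is deterministic across runs.
        digest = zlib.crc32(str(row[dist_key]).encode("utf-8"))
        nodes[digest % node_count].append(row)
    for node_rows in nodes.values():
        # Within a node, rows are kept in sort-key order.
        node_rows.sort(key=lambda r: r[sort_key])
    return dict(nodes)

rows = [
    {"user_id": 7, "ts": 30},
    {"user_id": 3, "ts": 10},
    {"user_id": 7, "ts": 5},
    {"user_id": 3, "ts": 20},
]
layout = place_rows(rows, dist_key="user_id", sort_key="ts", node_count=2)
```

In this toy model, all rows sharing a distribution-key value land on the same node, so a join or aggregate on that key needs no cross-node traffic, while a range filter on the sort key can skip ahead within each node.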
  • 55. My Work with Redshift
      ● Data warehouse for Facebook user feeds and related app data
      ● Inputs
        – RDS (MySQL)
        – DynamoDB
        – Facebook
      ● Stats
        – ~2TB of compressed data
        – Two main tables, ~5bil and ~25bil rows respectively
  • 56. Advantages / Disadvantages
      ● Fast at copying data in from S3
      ● Fast at computing aggregate/analytical functions over an entire table
      ● Decent at intra-db operations (create table as select, insert into select)
      ● Most everything else is slow
      ● Without traditional indexes, table design isn't as flexible
  • 57. Lessons/Tips
      ● Optimize load size (1 MB to 1 GB per file)
      ● Compress input
      ● Upsert when needed, and always vacuum
      ● Don't populate tables with 'CREATE TABLE AS' if you like compression (which you do)
      ● To avoid complicated joins, consider computing single-table aggregates and join on the results
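The first two tips (bounded file sizes, compressed input) could be prepared with something like the sketch below before uploading to S3. It is a minimal illustration under stated assumptions: `write_gzip_parts` and its file-naming scheme are hypothetical, and the size limit is measured on uncompressed bytes because the slide does not say which size it means.

```python
import gzip
import os
import tempfile

def write_gzip_parts(lines, out_dir, prefix, max_bytes):
    """Write text lines into numbered .gz files, starting a new part once the
    uncompressed size of the current part would exceed max_bytes."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    handle = None
    written = 0
    try:
        for line in lines:
            data = (line.rstrip("\n") + "\n").encode("utf-8")
            # Roll over to a new part when the current one is full.
            if handle is None or (written and written + len(data) > max_bytes):
                if handle is not None:
                    handle.close()
                path = os.path.join(out_dir, "%s.part%04d.gz" % (prefix, len(paths)))
                handle = gzip.open(path, "wb")
                paths.append(path)
                written = 0
            handle.write(data)
            written += len(data)
    finally:
        if handle is not None:
            handle.close()
    return paths

# Tiny demonstration with an 8-byte limit; a real load would use megabytes.
with tempfile.TemporaryDirectory() as tmp:
    parts = write_gzip_parts(["a,1", "b,2", "c,3"], tmp, "events", max_bytes=8)
    part_count = len(parts)
    with gzip.open(parts[0], "rt", encoding="utf-8") as fh:
        first_part = fh.read()
```

A line longer than the limit still gets written, in a part of its own, so no input is dropped.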
  • 59. Keep an Eye on Compression
  • 62. Un-panel Discussion: volunteer to join the panel & ask questions from the floor!
  • 63. Unconference: Small groups & discussions, network. Pizza’s almost here!