Powering real-time big data analytics
with a next-gen GPU database
November 1, 2017
Matt Aslett
Research Director, Data Platforms
and Analytics Channel
451 Research
Dipti Borkar
Vice President, Product Marketing
Kinetica
Housekeeping Items
Questions?
To ask a question, click on the question button
Presentation Slides
A copy of the presentation will be provided to all attendees
Feedback
Don’t forget to leave feedback at the end of the webinar
Today’s speakers
Matt Aslett
Research Director, Data Platforms and Analytics Channel, 451 Research
Matt has overall responsibility for the data platforms and analytics research coverage, which includes operational and
analytic databases, Hadoop, grid/cache, stream processing, search-based data platforms, data integration, data quality,
data management, analytics, machine learning and advanced analytics. Matt's own primary area of focus includes data
management, reporting and analytics, and exploring how the various data platforms and analytics technology sectors
are converging in the form of next-generation data platforms.
Dipti Borkar
Vice President, Product Marketing, Kinetica
Dipti has over 15 years of experience in database technology across relational and non-relational databases. Prior to
Kinetica, Dipti was Vice President of Product Marketing at Couchbase and held several leadership positions there
including Head of Global Technical Sales and Head of Product Management.
Earlier in her career Dipti was a part of the product team at MarkLogic and managed development teams at IBM DB2
where she started her career as a database software engineer. Dipti holds a master's degree in Computer Science from
the University of California, San Diego with a specialization in databases, and an MBA from the Haas School of
Business at the University of California, Berkeley.
Powering real-time big data analytics
with a next-gen GPU database
Matt Aslett
Research Director, Data Platforms & Analytics
451 Research is a leading IT research & advisory company
Founded in 2000
300+ employees, including over 120 analysts
2,000+ clients: Technology & Service providers, corporate
advisory, finance, professional services, and IT decision makers
70,000+ IT professionals, business users and consumers in our research
community
Over 52 million data points published each quarter and 4,500+ reports
published each year
3,000+ technology & service providers under coverage
451 Research and its sister company, Uptime Institute, are the two divisions
of The 451 Group
Headquartered in New York City, with offices in London, Boston, San
Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia,
Taiwan, Singapore and Malaysia
Research & Data
Advisory
Events
Go 2 Market
Big data and beyond
• V is for various things…
but does not define big data
• To understand the trends driving ‘big
data’, 451 Research focused beyond the
nature of the data on what enterprises
wanted to do with it
• Totality – storing and processing all data (or as much as is economically viable)
• Exploration – schema-free approaches to analyzing data to identify new
patterns
• Frequency – more frequent analysis of data to enable real-time decision
making
Traditional systems of engagement and analysis
New systems of analysis
New systems of engagement
New systems of intelligence
Emergence of GPU databases
▪ Potential customers that are doing deep
learning and more advanced analytics on
HPC systems that leverage GPU
processors
▪ Data scientists or other specialists need
to pull data from a system of record and
load it into an HPC system to perform the
analytics leveraging certain algorithms.
Emergence of GPU databases
• While HPC systems are well equipped to
handle advanced analytics because they
leverage GPUs, there is also a price to be
paid as it requires moving the data from
one system to the other.
• GPU databases open up the door for
machine learning, deep learning and
other advanced analytical workloads to
be run alongside BI workloads, within the
same environment.
CPUs and GPUs
• A CPU is a very good general processor,
handling a variety of complex tasks well.
• A GPU is more specialized and can do
certain tasks extremely well.
• CPUs consist of multiple cores
• GPUs consist of thousands of cores
• CPUs are geared for serial operations
• GPUs are geared for parallel operations
▪ Can be paired together for the greatest overall optimization
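The serial versus parallel contrast can be sketched in plain Python. This is an illustration only: the `sales` column is hypothetical and a thread pool stands in for GPU cores, so it shows the split, reduce, and combine pattern rather than real GPU code or a GPU-style speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def serial_sum(values):
    """CPU-style: one worker walks the whole column in order."""
    total = 0
    for v in values:
        total += v
    return total

def parallel_sum(values, workers=4):
    """Parallel pattern: split the column into chunks, reduce each chunk
    independently, then combine the partial results."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)

sales = list(range(1, 1001))  # hypothetical column of 1,000 values
```

Both functions return the same total; the difference is that the chunks in `parallel_sum` do not depend on each other, which is the property that lets many cores work at once.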
What’s required for analytics?
Methods
Data
Processing
CPUs for standard SQL-based BI
Methods
Data
Processing
SQL
CPU
GPUs extend analytical benefits
Methods
Data
Processing
SQL
CPU
ML/DL
GPU
Some benefits of GPUs
▪ Performance,
acceleration
▪ Data sets, large/scale
▪ Analytics, machine
learning, deep learning
▪ Querying, real-time
dashboards, reports
▪ Visualization,
interactive, drill down
Key takeaways
Thank You!
matthew.aslett@451research.com
@maslett
www.451research.com
Powering real-time big data analytics with a
next-gen GPU database
Dipti Borkar | VP, Product Marketing | dborkar@kinetica.com
Company
80+, enterprise and startup expertise
Awards Customers and Partners
Investors
$50m Series A June 2017
Ray Lane
Company | Summary
2014
2016
Advances in Big Data Processing
1990 - 2000’s: DATA WAREHOUSE
RDBMS & Data Warehouse technologies enable organizations to store and analyze growing volumes of data on high performance machines, but at high cost.
2005…: DISTRIBUTED STORAGE
Hadoop and MapReduce enable distributed storage and processing across multiple machines. Storing massive volumes of data becomes more affordable, but performance is slow.
2010…: AFFORDABLE MEMORY
Affordable memory allows for faster data read and write. HANA, MemSQL, & Exadata provide faster analytics.
AT SCALE PROCESSING BECOMES THE BOTTLENECK
2017…: GPU ACCELERATED COMPUTE
GPU cores bulk process tasks in parallel - far more efficient for many data-intensive tasks than CPUs which process those tasks linearly.
GPU | Tale of Numbers
Performance: 100x Performance Increase
>100x gains over traditional RDBMS / NoSQL / In-Mem Databases
Cores: 4000 vs. 32
Modern GPUs can consist of up to 3000+ cores compared to 32 in a CPU
Costs: 75% Infrastructure Cost Savings
75% reduction in infrastructure costs, licensing, staff, etc.
More with Less
Increase performance, throughput, capability while minimizing the costs to support the business
GPUs are designed around thousands of small, efficient cores that are well suited to performing repeated similar instructions in parallel – making them ideal for the compute-intensive workloads required of large data sets.
Kinetica: Core
ANALYTICS DATABASE ACCELERATED BY GPUs
KINETICA
Commodity Hardware
w/ GPUs
Disk
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
GPU Accelerated
Columnar In-memory Database
HTTP Head Node
Columnar in-memory database
Data available much like a traditional RDBMS… rows,
columns
Data held in-memory; persisted to disk
Interact with Kinetica through its native REST API,
Java, Python, JavaScript, NodeJS, C++, SQL, etc… as
well as with various connectors
Native GIS & IP address object support
VERY FAST: Ideal for OLAP workloads
Typical hardware setup: 256GB - 1TB
memory with 2-4 GPUs per node.
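The A/B/C grid in the diagram can be read as three columns (A, B, C) over four rows. A minimal sketch of what "columnar" means, using hypothetical data and plain Python lists rather than anything specific to Kinetica's storage engine:

```python
# Row-oriented layout: each record is stored together (hypothetical data).
rows = [
    {"a": 1, "b": 10.0, "c": "x"},
    {"a": 2, "b": 20.0, "c": "y"},
    {"a": 3, "b": 30.0, "c": "x"},
    {"a": 4, "b": 40.0, "c": "y"},
]

def to_columns(records):
    """Pivot a list of row dicts into one contiguous list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns

columns = to_columns(rows)

# An OLAP-style aggregate touches only the column it needs.
total_b = sum(columns["b"])
```

Because an aggregate such as `total_b` reads a single contiguous column instead of every full record, this layout tends to suit scan-heavy analytical queries.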
Kinetica Architecture
ETL / STREAM
PROCESSING
ON DEMAND SCALE OUT +
1TB MEM / 2 GPU CARDS
SQL
Native
APIs
PARALLEL INGEST
Geospatial
WMS
Custom
Connectors
In-Database Processing
CUSTOM
LOGIC BIDMach
ML
Libs
BI DASHBOARDS
BI / GIS / APPS
CUSTOM APPS
& GEOSPATIAL
KINETICA ‘REVEAL’
STREAMING DATA
ERP/CRM/TRANSACTIONAL DATA
UDFs
The Kinetica cluster architecture
VISUALIZATION via ODBC/JDBC
APIs
Java API
JavaScript API
REST API
C++ API
Node.js API
Python API
OPEN SOURCE
INTEGRATION
Apache NiFi
Apache Kafka
Apache Spark
Apache Storm
GEOSPATIAL CAPABILITIES
Geometric
Objects
Tracks
Geospatial
Endpoints
WMS
WKT
KINETICA CLUSTER
On-Demand Scale
Each of the 4 nodes shown:
Commodity Hardware
w/ GPUs
Disk
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
Columnar
In-memory
HTTP Server
OTHER
INTEGRATION
Message Queues
ETL Tools
Streaming Tools
Parallel Ingest Provides High Performance Streaming
1 NODE (1TB/2GPU)
PARALLEL
INGEST
1 NODE (1TB/2GPU)
1 NODE (1TB/2GPU)
Each node of the system can share the task of data
ingest, which provides higher throughput. Ingest can
be made faster simply by adding more nodes.
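One common way to let every node share ingest is to hash each record's key and route it to a node. The sketch below assumes that generic hash-sharding pattern; the record shape, the `id` key, and the function names are hypothetical and do not describe Kinetica's actual ingest mechanism.

```python
import zlib

def node_for(key, num_nodes):
    """Pick a node by hashing the record key (stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes

def distribute(records, num_nodes):
    """Group records into one batch per node so nodes can ingest in parallel."""
    batches = [[] for _ in range(num_nodes)]
    for record in records:
        batches[node_for(record["id"], num_nodes)].append(record)
    return batches

# Hypothetical stream of 12 events spread over 3 nodes.
records = [{"id": f"evt-{i}", "value": i} for i in range(12)]
batches = distribute(records, num_nodes=3)
```

Adding a node means calling `distribute` with a larger `num_nodes`, so each node receives a smaller share of the stream.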
50-100x Faster on Queries with Large Datasets
• Large retailer tested complex SQL queries
on 3 years of retail data (150bn rows)
• 10 node Kinetica cluster against 30TB+
cluster from next best alternative
• The GPU is able to perform many instructions in
parallel, giving huge performance gains on
aggregations, group bys, joins, etc.
• Kinetica sustained ingest of 1.3bn
objects/minute with 70 attributes per row
WHEN COMPARED TO LEADING IN-MEMORY ALTERNATIVES
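To put the sustained ingest figure in per-second terms, the arithmetic can be worked through directly. Only the two numbers taken from the slide are inputs; the variable names are illustrative.

```python
objects_per_minute = 1_300_000_000  # from the slide: 1.3bn objects/minute
attributes_per_row = 70             # from the slide

objects_per_second = objects_per_minute / 60
values_per_second = objects_per_second * attributes_per_row

print(f"{objects_per_second:,.0f} objects/s")
print(f"{values_per_second:,.0f} attribute values/s")
```

That works out to roughly 21.7 million objects per second, or about 1.5 billion individual attribute values per second.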
Kinetica | Combined Strengths and Capabilities
Fast, Distributed Database Engine
Taking advantage of the parallel nature of the GPU, Kinetica delivers low-latency, high-performance analytics on large and streaming data sets.
Supercharge BI
Simultaneously ingest, explore, analyze, and visualize data within milliseconds to make critical decisions.
In-Database Analytics
User-defined functions (UDFs) allow for distributed custom compute directly from within the database.
Native Geospatial & Visualization Pipeline
Easier to work with large geospatial data sets.
Copyright (C) 2017 451 Research LLC
New systems of intelligence
Use Cases
FASTER BI WITH A GPU DATABASE
Tableau + Kinetica
Kinetica combines GPU’s brute-force compute with the
simplicity of a relational database for millisecond query
response on massive data sets without extensive
tuning.
• Incredibly fast query performance.
• Distributed design - ideal for large and streaming datasets.
• SQL-92 compliant relational database – without limits.
• More power means less need for tuning, indexing, and
administration of the database.
• No need to do pre-aggregation or build out cubes.
• Reduce reliance on specialized skills to prep and set-up
data.
Rethink interaction between business analyst & data scientist
SPECIALIZED AI/ DATA
SCIENCE TOOLS
SUBSET
DATA SCIENTISTS
BUSINESS USERS
EXTRACT
EXTRACTING DATA FOR AI IS
EXPENSIVE AND SLOW
ENTERPRISES
STRUGGLE TO MAKE
AI MODELS AVAILABLE
TO BUSINESS
???
• MapReduce
• Spark
• NoSQL DBs
• SQL Databases
• DFS
• CPU Compute Nodes
• GPU Compute Nodes
Proliferation of Hardware &
Software Components
Kinetica | The Ideal Process – Consolidate the BI / AI stack
Monte Carlo Risk
Custom Function 2
Custom Function 3
API EXPOSES CUSTOM
FUNCTIONS WHICH CAN BE
MADE AVAILABLE TO BUSINESS
USERS
BUSINESS USERS
DATA SCIENTISTS
UDFs
• Analytics
• AI/ML/Deep Learning
• Power of in-memory SQL
• Integrated CPU/GPU
• Bomb with Streams
Single Database Platform for
AI + BI
AI & BI on One GPU-Accelerated Database
HIGH PERFORMANCE ANALYTICS
DATABASE
UDF UDF UDF
ODBC
/ JDBC Native
REST API WMS
BUSINESS INTELLIGENCE
CUSTOM APPLICATIONS
HIGH FIDELITY
GEOSPATIAL PIPELINE
MACHINE LEARNING
& DEEP LEARNING GPU-ACCELERATED
DATA SCIENCE
PREDICTIVE MODELS
e.g. Risk Management,
Sales Volume, Fraud.
BIDMach
SQL
DATA SCIENTISTS
/ DEVELOPERS
BUSINESS
USERS
Distributed Geospatial Pipeline
NATIVE VISUALIZATION IS DESIGNED FOR FAST MOVING, LOCATION-BASED DATA
Native Geospatial Object Types
• Points, Shapes, Tracks, Labels
Native Geospatial Functions
• Filters (by area, by series, by geometry, etc.)
• Aggregation (histograms)
• Geofencing - triggers
• Video generation (based on dates/times)
Generate Map Overlay Imagery (via WMS)
• Rasterize points
• Style based on attributes (class-break)
• Heat maps
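Geofencing triggers of the kind listed above reduce to testing whether each new position falls inside a fence polygon. The sketch below uses the standard ray-casting test in plain Python; the fence, the track, and the function names are hypothetical and are not Kinetica's geospatial API.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a horizontal ray
    from (x, y) crosses; an odd count means the point is inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the ray's y value?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def geofence_events(track, fence):
    """Yield ("enter" or "exit", point) whenever a track crosses the fence."""
    was_inside = False
    for point in track:
        now_inside = point_in_polygon(point[0], point[1], fence)
        if now_inside != was_inside:
            yield ("enter" if now_inside else "exit", point)
        was_inside = now_inside

fence = [(0, 0), (4, 0), (4, 4), (0, 4)]    # hypothetical square fence
track = [(-1, 2), (1, 2), (3, 2), (5, 2)]   # hypothetical vehicle track
events = list(geofence_events(track, fence))
```

For this track, `events` holds one "enter" at (1, 2) and one "exit" at (5, 2); a trigger would fire on each of those transitions.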
Customer Case-studies
ENTERTAINMENT | Customer 360
CASE STUDY : BI ACCELERATION
BUSINESS OBJECTIVE
• Accelerate Tableau dashboards for faster customer 360 analytics
NEW CAPABILITIES DELIVERED
• 24X faster dashboard loads
• 3.5X faster slice and dice, drilldowns, filters
SOLUTION OVERVIEW
• Tableau Server and Kinetica running on Google Cloud Platform
• Kinetica accelerates EDW workload
• Simply point to Kinetica using Tableau’s replace data source feature
AD TECH | Real-time reporting & ad delivery
CASE STUDY : REAL-TIME DATA AND ANALYTICS
BUSINESS OBJECTIVE
• Be first to market with game changing technologies that put publishers’
needs first
• Support PubMatic’s real-time campaign reporting
NEW CAPABILITIES DELIVERED
• High-speed ingest, store, and persist data processing capabilities
• Ad-hoc analytics on ad impression and bid data
SOLUTION OVERVIEW
• Kinetica considered as a functional replacement for a 40-node Apache
Apex cluster -> smaller HW footprint
• High-speed data ingestion via native Kafka integration
• Python access to Kinetica data store for simplified data science discovery
• Contributed fast data capabilities to long term retention and archive
Hadoop Data Lake
“At PubMatic, we are consistently focused on being early to
market with leading technologies that put publishers’ needs
first. Processing over one trillion ad impressions
monthly, PubMatic provides omni-channel revenue automation
technology for publishers and programmatic tools for media
buyers. Leveraging leading edge data and technology
innovation, Kinetica contributes high-speed ingest, store,
and persist data processing capabilities in support
of PubMatic’s real-time reporting and ad pacing engine.”
- Vasu Cherlopalle, Vice President of Big Data and Analytics
“One of the things I like about
Kinetica is it gives us more of a
general-purpose use of the
technology. There has been a lot
of software created to answer
certain questions [but] highly
specialized tools have limited
functionality and are tuned to do
a certain workload.”
- Mark Ramsey, Chief Data Officer at GSK
BUSINESS OBJECTIVE
• Faster processing of transcriptomics to run simulations of
chemical reactions for drug discovery, research, and
development
NEW CAPABILITIES DELIVERED
• In-database processing to develop models, leveraging GPU
acceleration for performance, and direct access to CUDA APIs
via UDFs deployed within Kinetica
• Seek out signals from a massive collection of drug targets
combined from external data, historical data from
experiments, and clinical trials
SOLUTION OVERVIEW
• Kinetica running on-premises on a cluster of 7 HPE DL 380
servers
• Familiar relational database with GPU acceleration
LIFE SCIENCES : GENOMICS RESEARCH
CASE STUDY : ADVANCED IN-DATABASE ANALYTICS
PIPELINE & WELL ANALYTICS
CASE STUDY : LOCATION BASED ANALYTICS
BUSINESS OBJECTIVE
• Augment SaaS offering to provide research data and
analytics on oil and gas to energy investors and operators
with geospatial query, visualization, and analytics
NEW CAPABILITIES DELIVERED
• Geospatial visualization and analytics of a massive number of
wells and pipelines by land ownership, region, etc.
• Custom visualizations and charts for data-driven insights
• Embedded solution with seamless Node.js integration, GPU
acceleration
SOLUTION OVERVIEW
• Kinetica running in RSEG’s Amazon Web Services VPC
deployment
LOGISTICS | Workforce optimization
BUSINESS OBJECTIVE
• Deliver better business services, optimize operations, and save
costs across 600,000 employees and 215,000 delivery vehicles
that deliver 500 million pieces of mail daily
NEW CAPABILITIES DELIVERED
• Real-time delivery and pickup notifications, shipment routing,
just-in-time supplies
• Real-time route optimization - route planning, rerouting
• Geospatial analytics to uncover overlapping coverage areas,
uncovered areas, and distribution bottlenecks
SOLUTION OVERVIEW
• USPS runs Kinetica as a 70 TB in-memory database on a 200-node
HPE DL 380 system. Each node consists of a single x86 blade
server with 1TB RAM and 2 NVIDIA K80 GPUs
• Kinetica collects, processes, and analyzes 200,000 messages
per minute for real-time streaming analytics. 15,000 daily
sessions with 5 9’s uptime
Summary | Kinetica GPU Accelerated Analytics
PERFORMANCE
• Distributed
• Columnar
• In-Memory
• Relational
• GPU Accelerated
• Ingest, Query, Compute
SCALABLE
• Commodity Hardware
• On-premises or Cloud
• Scales to 100s of TB
• Less Infrastructure
• More Compute
• Predictable, Linear
CONVERGED AI AND BI
• Machine Learning
• Artificial Intelligence
• In-Database
• Self-Service
INDUSTRY-STANDARD CONNECTIVITY
• Open Source
• Kafka, Storm, NiFi, Spark
• ODBC, JDBC
• ANSI SQL/92
• APIs for Java, JS, C++, Python, Node.js, REST
Thank you!
Dipti Borkar | VP, Product Marketing | dborkar@kinetica.com
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
DATAVERSITY
 
Accelerating Big Data Analytics
Accelerating Big Data AnalyticsAccelerating Big Data Analytics
Accelerating Big Data Analytics
Attunity
 
Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013Girish Juneja - Intel Big Data & Cloud Summit 2013
Girish Juneja - Intel Big Data & Cloud Summit 2013
IntelAPAC
 
Virtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & BénéficesVirtualisation de données : Enjeux, Usages & Bénéfices
Virtualisation de données : Enjeux, Usages & Bénéfices
Denodo
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
DATAVERSITY
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Denodo
 
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
Accelerate Self-Service Analytics with Virtualization and Visualisation (Thai)
Denodo
 
Ad

Powering Real-Time Big Data Analytics with a Next-Gen GPU Database

  • 1. Powering real-time big data analytics with a next-gen GPU database November 1, 2017 Matt Aslett Research Director, Data Platforms and Analytics Channel 451 Research Dipti Borkar Vice President, Product Marketing Kinetica
  • 2. Housekeeping Items. Presentation slides: A copy of the presentation will be provided to all attendees. Questions? To ask a question, click on the question button. Feedback: Don’t forget to leave feedback at the end of the webinar.
  • 3. Today’s speakers. Matt Aslett, Research Director, Data Platforms and Analytics Channel, 451 Research. Matt has overall responsibility for the data platforms and analytics research coverage, which includes operational and analytic databases, Hadoop, grid/cache, stream processing, search-based data platforms, data integration, data quality, data management, analytics, machine learning and advanced analytics. Matt's own primary area of focus includes data management, reporting and analytics, and exploring how the various data platforms and analytics technology sectors are converging in the form of next-generation data platforms. Dipti Borkar, Vice President, Product Marketing, Kinetica. Dipti has over 15 years of experience in database technology across relational and non-relational databases. Prior to Kinetica, Dipti was Vice President of Product Marketing at Couchbase and held several leadership positions there, including Head of Global Technical Sales and Head of Product Management. Earlier in her career Dipti was a part of the product team at MarkLogic and managed development teams at IBM DB2, where she started her career as a database software engineer. Dipti holds a Master's degree in Computer Science from the University of California, San Diego with a specialization in databases, and an MBA from the Haas School of Business at University of California, Berkeley.
  • 4. Powering real-time big data analytics with a next-gen GPU database Matt Aslett Research Director, Data Platforms & Analytics
  • 5. 451 Research is a leading IT research & advisory company 5 Founded in 2000 300+ employees, including over 120 analysts 2,000+ clients: Technology & Service providers, corporate advisory, finance, professional services, and IT decision makers 70,000+ IT professionals, business users and consumers in our research community Over 52 million data points published each quarter and 4,500+ reports published each year 3,000+ technology & service providers under coverage 451 Research and its sister company, Uptime Institute, are the two divisions of The 451 Group Headquartered in New York City, with offices in London, Boston, San Francisco, Washington DC, Mexico, Costa Rica, Brazil, Spain, UAE, Russia, Taiwan, Singapore and Malaysia Research & Data Advisory Events Go 2 Market
  • 6. 6
  • 7. Big data and beyond • V is for various things… but does not define big data • To understand the trends driving ‘big data’, 451 Research focused beyond the nature of the data on what enterprises wanted to do with it
  • 8. Big data and beyond • V is for various things… but does not define big data • To understand the trends driving ‘big data’, 451 Research focused beyond the nature of the data on what enterprises wanted to do with it • Totality – storing and processing all data (or as much as is economically viable) • Exploration – schema-free approaches to analyzing data to identify new patterns • Frequency – more frequent analysis of data to enable real-time decision making
  • 9. Traditional systems of engagement and analysis
  • 10. New systems of analysis
  • 11. New systems of engagement
  • 12. New systems of intelligence
  • 13. New systems of intelligence
  • 14. Emergence of GPU databases ▪ Potential customers that are doing deep learning and more advanced analytics on HPC systems that leverage GPU processors ▪ Data scientists or other specialists need to pull data from a system of record and load it into an HPC system to perform the analytics leveraging certain algorithms.
  • 15. Emergence of GPU databases • While HPC systems are well equipped to handle advanced analytics because they leverage GPUs, there is also a price to be paid as it requires moving the data from one system to the other. • GPU databases open up the door for machine learning, deep learning and other advanced analytical workloads to be run alongside BI workloads, within the same environment.
  • 16. CPUs and GPUs • A CPU is a very good general processor, handling a variety of complex tasks well. • A GPU is more specialized and can do certain tasks extremely well. • CPUs consist of multiple cores • GPUs consist of thousands of cores • CPUs geared for serial operations • GPUs geared for parallel operations ▪ Can be paired together for the greatest overall optimization
  • 17. What’s required for analytics? Methods, Data, Processing
  • 18. CPUs for standard SQL-based BI: Methods, Data, Processing; SQL, CPU
  • 19. GPUs extend analytical benefits: Methods, Data, Processing; SQL, CPU; ML/DL, GPU. Some benefits of GPUs ▪ Performance, acceleration ▪ Data sets, large/scale ▪ Analytics, machine learning, deep learning ▪ Querying, real-time dashboards, reports ▪ Visualization, interactive, drill down
  • 22. Powering real-time big data analytics with a next-gen GPU database. Dipti Borkar | VP, Product Marketing | [email protected]
  • 23. Company 80+, enterprise and startup expertise Awards Customers and Partners Investors $50m Series A June 2017 Ray Lane Company| Summary 2014 2016 23
  • 24. Advances in Big Data Processing. DATA WAREHOUSE (1990 - 2000’s): RDBMS & Data Warehouse technologies enable organizations to store and analyze growing volumes of data on high performance machines, but at high cost. DISTRIBUTED STORAGE (2005…): Hadoop and MapReduce enable distributed storage and processing across multiple machines. Storing massive volumes of data becomes more affordable, but performance is slow. AFFORDABLE MEMORY (2010…): Affordable memory allows for faster data read and write. HANA, MemSQL, & Exadata provide faster analytics. At scale, processing becomes the bottleneck. GPU ACCELERATED COMPUTE (2017…): GPU cores bulk process tasks in parallel, far more efficient for many data-intensive tasks than CPUs, which process those tasks linearly.
  • 25. GPU | Tale of Numbers: 100x Performance Increase, 75% Infrastructure Cost Savings, 4000 vs. 32. Performance: >100x gains over traditional RDBMS / NoSQL / In-Mem Databases. Cores: Modern GPUs can consist of up to 3000+ cores compared to 32 in a CPU. Costs: 75% reduction in infrastructure costs, licensing, staff, etc. More with Less: Increase performance, throughput, capability while minimizing the costs to support the business. GPUs are designed around thousands of small, efficient cores that are well suited to performing repeated similar instructions in parallel, making them ideal for the compute-intensive workloads required of large data sets.
  • 26. Kinetica: Core 26 ANALYTICS DATABASE ACCELERATED BY GPUs KINETICA Commodity Hardware w/ GPUs Disk A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4 GPU Accelerated Columnar In-memory Database HTTP Head Node Columnar in-memory database Data available much like a traditional RDBMS… rows, columns Data held in-memory; persisted to disk Interact with Kinetica through its native REST API, Java, Python, JavaScript, NodeJS, C++, SQL, etc… as well as with various connectors Native GIS & IP address object support VERY FAST: Ideal for OLAP workloads Typical hardware setup: 256GB - 1TB memory with 2-4 GPUs per node.
  • 27. Kinetica Architecture: ETL / STREAM PROCESSING; ON DEMAND SCALE OUT + 1TB MEM / 2 GPU CARDS; SQL; Native APIs; PARALLEL INGEST; Geospatial WMS; Custom Connectors; In-Database Processing; CUSTOM LOGIC; BIDMach ML Libs; BI DASHBOARDS; BI / GIS / APPS; CUSTOM APPS & GEOSPATIAL; KINETICA ‘REVEAL’; STREAMING DATA; ERP/CRM/TRANSACTIONAL DATA; UDFs
  • 28. The Kinetica cluster architecture. VISUALIZATION via ODBC/JDBC. APIs: Java API, JavaScript API, REST API, C++ API, Node.js API, Python API. OPEN SOURCE INTEGRATION: Apache NiFi, Apache Kafka, Apache Spark, Apache Storm. GEOSPATIAL CAPABILITIES: Geometric Objects, Tracks, Geospatial Endpoints, WMS, WKT. KINETICA CLUSTER, On-Demand Scale: four nodes, each Commodity Hardware w/ GPUs, Disk, Columnar In-memory (A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4), HTTP Server. OTHER INTEGRATION: Message Queues, ETL Tools, Streaming Tools
  • 29. Parallel Ingest Provides High Performance Streaming. PARALLEL INGEST across 1 NODE (1TB/2GPU), 1 NODE (1TB/2GPU), 1 NODE (1TB/2GPU). Each node of the system can share the task of data ingest, which provides higher and faster throughput. It can always be made faster simply by adding more nodes.
  • 30. 50-100x Faster on Queries with Large Datasets, When Compared to Leading In-Memory Alternatives • Large retailer tested complex SQL queries on 3 years of retail data (150bn rows) • 10 node Kinetica cluster against 30TB+ cluster from next best alternative • GPU is able to perform many instructions in parallel. Huge performance gains on aggregations, group bys, joins, etc. • Kinetica sustained ingest of 1.3bn objects/minute with 70 attributes per row
  • 31. Combined Strengths and Capabilities
  • 32. Kinetica | Combined Strengths and Capabilities. Supercharge BI: Taking advantage of the parallel nature of the GPU, Kinetica delivers low-latency, high-performance analytics on large and streaming data sets. Fast, Distributed Database Engine: Simultaneously ingest, explore, analyze, and visualize data within milliseconds to make critical decisions. In-Database Analytics: User-defined functions (UDFs) allow for distributed custom compute directly from within the database. Native Geospatial & Visualization Pipeline: Easier to work with large geospatial data sets.
  • 33. Copyright (C) 2017 451 Research LLC New systems of intelligence 33
  • 35. FASTER BI WITH A GPU DATABASE 35 Tableau + Kinetica Kinetica combines GPU’s brute-force compute with the simplicity of a relational database for millisecond query response on massive data sets without extensive tuning. • Incredibly fast query performance. • Distributed design - ideal for large and streaming datasets. • SQL-92 compliant relational database – without limits. • More power means less need for tuning, indexing, and administration of the database. • No need to do pre-aggregation or build out cubes. • Reduce reliance on specialized skills to prep and set-up data.
  • 36. 36 Rethink interaction between business analyst & data scientist SPECIALIZED AI/ DATA SCIENCE TOOLS SUBSET DATA SCIENTISTS BUSINESS USERS EXTRACT EXTRACTING DATA FOR AI IS EXPENSIVE AND SLOW ENTERPRISES STRUGGLE TO MAKE AI MODELS AVAILABLE TO BUSINESS ??? • MapReduce • Spark • NoSQL DBs • SQL Databases • DFS • CPU Compute Nodes • GPU Compute Nodes Proliferation of Hardware & Software Components
  • 37. Kinetica | The Ideal Process – Consolidate the BI / AI stack 37 Monte Carlo Risk Custom Function 2 Custom Function 3 API EXPOSES CUSTOM FUNCTIONS WHICH CAN BE MADE AVAILABLE TO BUSINESS USERS BUSINESS USERS DATA SCIENTISTS UDFs • Analytics • AI/ML/Deep Learning • Power of in-memory SQL • Integrated CPU/GPU • Bomb with Streams Single Database Platform for AI + BI
  • 38. AI & BI on One GPU-Accelerated Database HIGH PERFORMANCE ANALYTICS DATABASE UDF UDF UDF ODBC / JDBC Native REST API WMS BUSINESS INTELLIGENCE CUSTOM APPLICATIONS HIGH FIDELITY GEOSPATIAL PIPELINE MACHINE LEARNING & DEEP LEARNING GPU-ACCELERATED DATA SCIENCE PREDICTIVE MODELS e.g. Risk Management, Sales Volume, Fraud. BIDMach SQL DATA SCIENTISTS / DEVELOPERS BUSINESS USERS 38
  • 39. Distributed Geospatial Pipeline 39 NATIVE VISUALIZATION IS DESIGNED FOR FAST MOVING, LOCATION-BASED DATA Native Geospatial Object Types • Points, Shapes, Tracks, Labels Native Geospatial Functions • Filters (by area, by series, by geometry, etc.) • Aggregation (histograms) • Geofencing - triggers • Video generation (based on dates/times) Generate Map Overlay Imagery (via WMS) • Rasterize points • Style based on attributes (class-break) • Heat maps
  • 41. ENTERTAINMENT | Customer 360 41 CASE STUDY : BI ACCELERATION BUSINESS OBJECTIVE • Accelerate Tableau dashboards for faster customer 360 analytics NEW CAPABILITIES DELIVERED • 24X faster dashboard loads • 3.5X faster slice and dice, drilldowns, filters SOLUTION OVERVIEW • Tableau Server and Kinetica running on Google Cloud Platform • Kinetica accelerates EDW workload • Simply point to Kinetica using Tableau’s replace data source feature
  • 42. 42 AD TECH | Real-time reporting & ad delivery CASE STUDY : REAL-TIME DATA AND ANALYTICS BUSINESS OBJECTIVE • Be first to market with game changing technologies that put publishers’ needs first • Support PubMatic’s real-time campaign reporting NEW CAPABILITIES DELIVERED • High-speed ingest, store, and persist data processing capabilities • Ad-hoc analytics on ad impression and bid data SOLUTION OVERVIEW • Kinetica considered as a functional replacement for a 40-node Apache Apex cluster -> smaller HW footprint • Hi-speed data ingestion via native Kafka integration • Python access to Kinetica data store for simplified data science discovery • Contributed fast data capabilities to long term retention and archive Hadoop Data Lake “At PubMatic, we are consistently focused on being early to market with leading technologies that put publishers’ needs first. Processing over one trillion ad impressions monthly, PubMatic provides omni-channel revenue automation technology for publishers and programmatic tools for media buyers. Leveraging leading edge data and technology innovation, Kinetica contributes high-speed ingest, store, and persist data processing capabilities in support of PubMatic’s real-time reporting and ad pacing engine.” - Vasu Cherlopalle, Vice President of Big Data and Analytics
  • 43. LIFE SCIENCES : GENOMICS RESEARCH. CASE STUDY : ADVANCED IN-DATABASE ANALYTICS. "One of the things I like about Kinetica is it gives us more of a general-purpose use of the technology. There has been a lot of software created to answer certain questions [but] highly specialized tools have limited functionality and are tuned to do a certain workload." Mark Ramsey, Chief Data Officer at GSK. BUSINESS OBJECTIVE • Faster processing of transcriptomics to run simulations of chemical reactions for drug discovery, research, and development NEW CAPABILITIES DELIVERED • In-database processing to develop models, leveraging GPU acceleration for performance, and direct access to CUDA APIs via UDFs deployed within Kinetica • Seek out signals from massive collection of drug targets combined from external data, historical data from experiments, and clinical trials SOLUTION OVERVIEW • Kinetica running on-premises on a cluster of 7 HPE DL 380 servers • Familiar relational database with GPU acceleration
  • 44. PIPELINE & WELL ANALYTICS 44 CASE STUDY : LOCATION BASED ANALYTICS BUSINESS OBJECTIVE • Augment SaaS offering to provide research data and analytics on oil and gas to energy investors and operators with geospatial query, visualization, and analytics NEW CAPABILITIES DELIVERED • Geospatial visualization and analytics of massive number of wells, pipelines by land ownership, region etc. • Custom visualizations and charts for data-driven insights • Embedded solution with seamless Node.js integration, GPU acceleration SOLUTION OVERVIEW • Kinetica running in RSEG’s Amazon Web Services VPC deployment
  • 45. LOGISTICS | Workforce optimization BUSINESS OBJECTIVE • Deliver better business services, optimize operations, and save costs across 600,000 employees, 215,000 delivery vehicles, and deliver 500 million pieces of mail daily NEW CAPABILITIES DELIVERED • Real-time delivery and pickup notifications, shipment routing, just-in-time supplies • Real-time route optimization - route planning, rerouting • Geospatial analytics to uncover overlapping coverage areas, uncovered areas, and distribution bottlenecks SOLUTION OVERVIEW • USPS runs Kinetica as a 70 TB in-memory database on an HPE DL 380 200 node system. Each node consists of a single x86 blade server with 1TB RAM, 2 NVIDIA K80 GPUs • Kinetica collects, processes, and analyzes 200,000 messages per minute for real-time streaming analytics. 15,000 daily sessions with five-nines uptime
  • 46. Summary | Kinetica GPU Accelerated Analytics. PERFORMANCE, SCALABLE, CONVERGED AI AND BI, INDUSTRY-STANDARD CONNECTIVITY ▪ Distributed ▪ Columnar ▪ In-Memory ▪ Relational ▪ GPU Accelerated ▪ Ingest, Query, Compute ▪ Commodity Hardware ▪ On-premises or Cloud ▪ Scales to 100s of TB ▪ Less Infrastructure ▪ More Compute ▪ Predictable, Linear ▪ Machine Learning ▪ Artificial Intelligence ▪ In-Database ▪ Self-Service ▪ Open Source ▪ Kafka, Storm, NiFi, Spark ▪ ODBC, JDBC ▪ ANSI SQL/92 ▪ APIs for Java, JS, C++, Python, Node.js, REST
  • 47. Thank you! Dipti Borkar | VP, Product Marketing | [email protected]
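
The slides describe GPUs as suited to running the same instruction over many values in parallel, with large gains on aggregations and group-bys over columnar data. As a rough illustration of that pattern only, the sketch below splits two parallel columns into chunks, aggregates each chunk independently, and merges the partial results. It is plain Python on CPU threads and does not use Kinetica or a GPU; the function names (`partial_group_sum`, `parallel_group_sum`) and the sample columns are hypothetical, and Python threads stand in for the many cores a GPU would apply to the same chunked work.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_group_sum(keys, values):
    """Sum values per key for one chunk of two parallel columns."""
    totals = Counter()
    for key, value in zip(keys, values):
        totals[key] += value
    return totals


def parallel_group_sum(keys, values, workers=4):
    """Split the columns into chunks, aggregate each chunk, then merge."""
    size = max(1, -(-len(keys) // workers))  # ceiling division, at least 1
    chunks = [
        (keys[i:i + size], values[i:i + size])
        for i in range(0, len(keys), size)
    ]
    merged = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda chunk: partial_group_sum(*chunk), chunks):
            merged.update(part)  # Counter.update adds the partial sums
    return dict(merged)


# Hypothetical columnar data: one list per column, aligned by row position.
store = ["east", "west", "east", "north", "west", "east"]
sales = [10, 20, 30, 40, 50, 60]

result = parallel_group_sum(store, sales, workers=3)
print(result)
```

The merge step works because a sum can be computed per chunk and combined afterwards in any order; aggregates with that property are the ones that split cleanly across many workers.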