Kyligence Introduction
MicroStrategy Partnership
Saswata Sengupta
© Kyligence Inc. 2019, Confidential.
Apache Kylin
Top Level Apache Project
• The only open-source OLAP on big data platform
Best Open-source Big Data Tool
• InfoWorld’s Bossies (Best of Open Source Software Awards) in 2015 & 2016
Sub-Second Interactive Query
• Large scale, high concurrency, sub-second query latency, multi-dimension
1000+ Organizations
• Adopted by thousands of organizations globally
Kyligence = Kylin + Intelligence
• Founded in 2016 by the creators of Apache Kylin
• Built around Kylin, augmented with AI and enhanced to deliver enterprise analytic performance
• Named one of CRN's top 10 big data startups in 2018
• Global Presence: San Jose, Seattle, New York, Shanghai,
Beijing
• VCs: Fidelity International, Shunwei Capital, Broadband
Capital, Redpoint, Cisco, Coatue
Accelerate Critical Business Decisions with AI-Augmented Data Management and Analytics
2016: Founded, Pre-A (Redpoint, Cisco)
2017: Series A (CBC, Shunwei)
2018: Series B (8Roads)
2019: Series C (Coatue)
Trusted by Global Fortune 500
BFSI
Telecom
Technology
Manufacturing, Retail, etc.
Pains in Collaboration
Data Engineer
• Manage data sources
• Design the data model to keep one source of truth
• ETL and load data
Data Analyst
• Develop dashboards and reports
• Self-service analysis to answer business questions
Low development efficiency in fulfilling business requirements
Limited dimensions and measures in a model to serve complex calculations
Difficult to adapt when analytics requirements or source data change
Time to insight is slow
Kyligence Ecosystem
Global Partners
• Fully enabled on leading cloud and data platforms (Azure, AWS, Google Cloud, Cloudera)
• Integrated with popular BI and visualization tools (Tableau, Power BI, Qlik, MicroStrategy)
• Certified on main Hadoop distributions (CDP)
Kyligence Enterprise: Accelerate Mission-critical Analytics Intelligently
• Unified Query Entrance
• Business Semantic Layer
• Query Pattern for all data
• High Performance Engine
[Architecture diagram: ODBC/JDBC and API/SDK access; SQL/MDX Semantic Services; subject areas (Finance, Marketing, Sales, Customer, Checkout) served by Cube Index; Distributed Query Engine, AI-Augmented Engine and Smart Pushdown to RDBMS and Hive sources; Metadata Management; Enterprise Security. The diagram also carries the percentage labels 10%, 4%, 80% and 6%, whose captions did not survive extraction.]
Kyligence Cloud
[Diagram: stages labelled Source, Landing & Transformation, Semantic & Augmentation, and Applications; Azure Blob Storage and Azure Synapse; indexes for Finance, Marketing, Sales and more.]
AI Augmented Engine: Intelligent Data Development
AI Augmented Engine: One-click Acceleration
• Self-maintaining
• Dynamic auto-modeling
• Self-learning engine
• One-click acceleration
• Adaptive model
AI-Augmented Engine — Learn From Your Analytics History
Advanced Tuning Features – Push Down and Aggregate Index
Under the hood: Smart Cuboids
• Each model consists of N-dimensional cuboids, each of which is a combination of several of the model's dimensions.
• Apache Spark is used to build the cuboids ahead of time, so queries can be answered from precomputed results extremely fast.
• When the user sends a query, the model looks for the matching cuboid/segment and returns the results from it.
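The cuboid lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not Kyligence's or Kylin's actual implementation: the function names `enumerate_cuboids` and `pick_cuboid`, the example dimensions, and the "smallest covering cuboid" rule are all assumptions made for the example, and a real engine would presumably prune the cuboid set rather than build every combination.

```python
from itertools import combinations


def enumerate_cuboids(dimensions):
    """Return every non-empty combination of dimensions (2^n - 1 cuboids)."""
    dims = sorted(dimensions)
    return [
        frozenset(combo)
        for size in range(1, len(dims) + 1)
        for combo in combinations(dims, size)
    ]


def pick_cuboid(cuboids, query_dims):
    """Pick the smallest cuboid that contains every dimension the query uses."""
    needed = frozenset(query_dims)
    candidates = [c for c in cuboids if needed <= c]
    if not candidates:
        # Hypothetical fallback: the caller would push the query down to the source.
        return None
    # Fewest dimensions first; the sorted names only make ties deterministic.
    return min(candidates, key=lambda c: (len(c), sorted(c)))


# Hypothetical model with three dimensions.
cuboids = enumerate_cuboids(["region", "product", "day"])
chosen = pick_cuboid(cuboids, ["day", "region"])
print(len(cuboids), sorted(chosen))
```

Choosing the smallest covering cuboid reflects the idea that a cuboid with fewer dimensions holds fewer aggregated rows, so less data has to be scanned to answer the query.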
Unified Semantic Layer
[Diagram: BI tools (Excel, MicroStrategy, other BI tools) sit on top of the Semantic Layer (BI Integration, Access Control, Enterprise Security), which sits on the Query Platform (Query Engine, Model) over the Data Sources (Cloud DW, Snowflake, Parquet, ORC, CSV, Blob Storage).]
• Translate technical details into business terminology
• Synchronize semantics across major BI tools
• Unified business definitions
• Flexible business calculations
Elastic Scaling: Handle Peak Time Automatically
• Fewer compute and storage resources utilized
• Dynamic on-demand cluster resizing
• Uses spot instances
• Efficient planning for data growth
TPC-H 22 Queries
SF=50: Query Response Time | 0.5 Billion
SF=500: Query Response Time | 5 Billion
For each dataset:
• No warm-up
• Run each query 3 times
• Record the average time
• Lower is better
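The measurement method listed on this slide (no warm-up, three runs per query, average recorded) can be sketched as a small timing harness. This is a minimal sketch under stated assumptions: `run_query` stands for whatever callable submits SQL to the engine under test, and `fake_run_query` and the two query strings are placeholders, not the real TPC-H queries or the harness used for the published numbers.

```python
import time
from statistics import mean


def time_query(run_query, sql, runs=3):
    """Run one query `runs` times with no warm-up run and return the mean seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query(sql)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def benchmark(run_query, queries, runs=3):
    """Return {query name: average response time in seconds}; lower is better."""
    return {name: time_query(run_query, sql, runs) for name, sql in queries.items()}


# Placeholder engine: records the calls instead of contacting a real cluster.
calls = []


def fake_run_query(sql):
    calls.append(sql)


results = benchmark(fake_run_query, {"Q1": "SELECT 1", "Q3": "SELECT 3"})
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f}s")
```

Because there is no warm-up run, the first of the three timings includes any cold-start cost, which is then averaged into the reported figure.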
Financial Risk Management - replacing the large SSAS cube
Challenges
• 5TB SSAS cube with 5 billion rows daily incremental data
• 14 lookup tables, half with cardinality over 20M (largest 200M)
• 600+ dimensions
• 30+ analysis users
• Analysts’ work is locked by the incremental loading workload, and system crashes happen frequently
• Poor performance on data loading and queries (especially on UHC, Count Distinct, Correlation)
• Limited concurrent users
Kyligence’s Solution
Modernization: same data source, same front-end BI, similar OLAP concepts, comparable semantic layer, finer-grained access control
Scalability, Performance, Low Cost
• Single cube, easy management
• Analysts’ work no longer interrupted
• Transparent to business users, same analysis tool (Excel)
• Improved query and loading performance
• Support 1000+ concurrent users
• Meet future requirements: predicted 40% data volume growth, migration to cloud, real-time
THANK YOU

Lightning-Fast, Interactive Business Intelligence Performance with MicroStrategy and Kyligence


Editor's Notes

  • #7: UBS case uses Databricks
  • #8: UBS case uses Databricks
  • #9: Azure storage to be generic, replace Alibaba with Hadoop
  • #10: Flexible multidimensional modeling. Changes in the model affect only the relevant indexes. Changes in model definitions and data loading do not affect each other.
  • #11: Flexible multidimensional modeling. Changes in the model affect only the relevant indexes. Changes in model definitions and data loading do not affect each other.
  • #17: Industry-recognized data analysis test data sets. Analysis of key business decisions. Practical business significance.
    0.5 billion dataset, testing the 22 TPC-H queries. Test method: each query is run 3 times and averaged, with no query engine warm-up.
    TPC-H Benchmark
      • Examines large volumes of data
      • High complexity queries
      • Answers critical business questions
      • 22 decision-making queries
      • E.g. the Shipping Priority Query retrieves the shipping priority and potential revenue of the orders having the largest revenue among those that had not been shipped as of a given date. The top 10 orders are listed in decreasing order of revenue.
    Hardware configuration
      • Same 4 physical nodes
      • Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz * 2
      • 86 vCores and 188 GB memory in total
      • Same Spark configuration for both KE 4 Beta and SparkSQL 2.4:
        spark.driver.memory=16g
        spark.executor.memory=8g
        spark.yarn.executor.memoryOverhead=2g
        spark.yarn.am.memory=1024m
        spark.executor.cores=5
        spark.executor.instances=17
    Query Response Time | 5 Billion
      • Same 4 physical nodes
      • Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz * 2
      • 86 vCores and 188 GB memory in total
      • Same Spark configuration for both KE 4 Beta and SparkSQL 2.4:
        spark.driver.memory=16g
        spark.executor.memory=20g
        spark.yarn.executor.memoryOverhead=2g
        spark.yarn.am.memory=1024m
        spark.executor.cores=5
        spark.executor.instances=30
  • #18: Benefits
      • Unlimited scale-out solution to fit future data volume growth
      • 1 hour non-blocking incremental loading
      • Single cube, easy maintenance
      • Low infrastructure cost with auto scaling
      • Support 100 concurrent users
      • Transparent to business users, same analysis tool (Excel)
    Architecture
      • Kyligence Enterprise 4.0
      • Azure HDInsight 3.6
      • Azure Data Lake gen2
      • Cluster size: 30 D3 V2 worker nodes
      • (potentially) ingest data from Oracle
    Query performance
      • 90% SQL queries within 5s
      • 90% MDX queries within 60s
      • 80% MDX queries within 20s
      • 50 QPS per query node
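
The Spark settings in note #17 can be sanity-checked with simple arithmetic: total executor cores are instances times cores per executor, and total executor memory is instances times (executor memory plus overhead). The sketch below does only that arithmetic. The class name `ExecutorSettings` and the variable names are illustrative, driver and application-master memory are left out, and the cluster totals are the "86 vCores, 188 GB" line that the note states for both runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorSettings:
    instances: int    # spark.executor.instances
    cores: int        # spark.executor.cores
    memory_gb: int    # spark.executor.memory
    overhead_gb: int  # spark.yarn.executor.memoryOverhead

    def total_cores(self) -> int:
        return self.instances * self.cores

    def total_memory_gb(self) -> int:
        return self.instances * (self.memory_gb + self.overhead_gb)


# Cluster totals as stated in note #17.
CLUSTER_VCORES = 86
CLUSTER_MEMORY_GB = 188

# Settings listed for the 0.5 billion and 5 billion row runs.
sf50 = ExecutorSettings(instances=17, cores=5, memory_gb=8, overhead_gb=2)
sf500 = ExecutorSettings(instances=30, cores=5, memory_gb=20, overhead_gb=2)

for label, settings in (("0.5 billion", sf50), ("5 billion", sf500)):
    fits = (
        settings.total_cores() <= CLUSTER_VCORES
        and settings.total_memory_gb() <= CLUSTER_MEMORY_GB
    )
    print(
        f"{label}: {settings.total_cores()} cores, "
        f"{settings.total_memory_gb()} GB, fits stated totals: {fits}"
    )
```

Under these assumptions the first configuration requests 85 cores and 170 GB, which fits the stated totals. The second requests 150 cores and 660 GB, so the hardware line repeated for the 5 billion run may not describe the cluster actually used for it.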