AtScale Technical Overview

2020

TABLE OF CONTENTS

Why AtScale?
AtScale System Overview
    AtScale Cloud OLAP
    AtScale Universal Semantic Layer™
    AtScale Autonomous Data Engineering™
    AtScale Intelligent Data Virtualization
        Collaborative Model Management
        Integrated Source Data Discovery
        Virtual Cube Catalog
        Data Virtualization
    AtScale Design Center
    AtScale Data Platform SDK
Deployment
Integrations
Frequently Asked Questions


WHY ATSCALE?
AtScale offers a modern approach to business intelligence and analytics in the cloud. AtScale’s Cloud
OLAP enables analysts to perform sub-second, multidimensional analysis with popular BI tools.
Enterprises rely on AtScale to overcome data and analytics challenges including: accelerating data-
driven decisions at scale, creating one compliant view of business metrics and definitions, controlling
the complexity and costs of analytics and reducing the risk of analytics.

AtScale helps enterprises:

▲ Seamlessly migrate to the cloud. Enterprises avoid business disruption by porting analytical
workloads without rewriting them.

▲ Simplify the analytics infrastructure. Enterprises can use the best tool and platform for the job
without moving data or adding new data stores.

▲ Modernize and future-proof the analytics stack. Enterprises take advantage of data lakes
and cloud data warehouses while preparing for future platforms.

▲ Secure and govern data in one place. With a live connection to all data in a virtual cube,
enterprises can stop worrying about data traveling the world on users’ laptops.

▲ Turbocharge analytics and machine learning initiatives. Enterprises can instantly integrate new
data sources because AtScale delivers a single, fast, data-as-a-service API for all data.

▲ See all data in a single, unified view, no matter where it is stored or how it is formatted.

▲ Conduct interactive and multidimensional analyses using business users’ preferred BI tools,
whether that is Excel, Power BI, Tableau, or something else.

▲ Get consistent answers across departments and business units via AtScale’s Universal Semantic
Layer™ that standardizes queries regardless of BI tool or query language.

© 2020 AtScale Inc. All rights reserved. 1


ATSCALE SYSTEM OVERVIEW
AtScale provides a single, secured and governed workspace for distributed data. The combination of
AtScale’s Cloud OLAP, Universal Semantic Layer™, Autonomous Data Engineering™ and Intelligent
Data Virtualization powers business intelligence (BI), artificial intelligence (AI) and machine learning
(ML) initiatives resulting in faster, more accurate business decisions at scale.

AtScale Cloud OLAP


AtScale provides a Cloud OLAP (or COLAP) solution that involves hosting an OLAP server on a Cloud
platform. This lets you create a virtual OLAP cube on top of Snowflake, Google BigQuery, Amazon
Redshift, Azure SQL Server, Hadoop, Oracle, Teradata and more. With COLAP, you can keep the same
great SSAS MDX power with a platform that is built for today’s data types and scalability.

[Figure: AtScale Cloud OLAP in two steps]


AtScale Cloud OLAP:

▲ Works directly on any data platform - Cloud Data Lake, Snowflake, Google BigQuery, AWS Redshift,
Teradata Cloud, Oracle Cloud, and more - without the need to extract data or build a physical cube

▲ Scales with your data platform rather than requiring you to provision a separate environment for
hosting your cubes

▲ Handles any amount of data because it doesn’t require that you pre-compute every possible
combination of dimensions and measures

▲ Doesn’t require ETL as it models cubes virtually without data engineering

▲ Handles data in different data stores and locations via Intelligent Data Virtualization

▲ Can be deployed in any Linux ecosystem

AtScale’s Universal Semantic Layer


The best way to get everyone on the same page is to have everyone speaking the same language. This
ensures that there won’t be conflicting answers to the same questions. A single, centralized workspace
for business metrics and definitions is key to offering one consistent, compliant view of data to both
business users and data scientists alike.

AtScale’s Universal Semantic Layer unifies semantic definitions and metrics for data and makes them
available in one location for BI, AI and ML applications. It works on data anywhere, whether it’s
on-premises or in the cloud. With AtScale’s semantic layer, you can move data “as is” to a different data
platform or environment without disrupting your end users.

How AtScale’s Universal Semantic Layer Works

The AtScale Universal Semantic Layer models data as virtual cubes using dimensions, hierarchies,
measures, attributes and calculations. It creates a business-centric view of the data using an OLAP
modeling paradigm. Using familiar modeling constructs, users can quickly source and model any data,
creating virtual views regardless of where the data lives or how it is stored.

Once a model is created, it can be published and accessed via ODBC, JDBC, MDX or REST. The AtScale
Universal Semantic Layer allows any BI tool to connect to AtScale and access models without the need
for additional data engineering or custom client-installed drivers. It virtualizes both the data platforms
and the data analytics tools through centralized management and governance.


The AtScale query service behaves like a data warehouse for SQL-speaking clients and an OLAP cube
for MDX-speaking clients. The AtScale service intercepts client queries, translates the virtualized
queries into physical queries and passes those queries onto the underlying physical data warehouse
or data lake for execution. As end users interact with the data in the virtual cube, AtScale will
automatically create or modify aggregates through its acceleration structures technology. It creates the
aggregates on the source data platform and determines the optimal location to store those aggregates
in a federated query scenario. AtScale’s automated tuning functionality works consistently regardless of
the underlying data platform (data warehouse or data lake) or location (on-premises or cloud).
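The routing step described above can be illustrated with a toy sketch. It is not AtScale's implementation: the `Aggregate` class, the `choose_source` and `rewrite` functions, and the aggregate table names are all hypothetical, and the rewrite assumes additive measures (so that `SUM` over an aggregate table gives the same answer as `SUM` over the fact table).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aggregate:
    """A hypothetical pre-computed aggregate table (illustration only)."""
    table: str
    dimensions: frozenset  # dimension columns the table is grouped by
    measures: frozenset    # additive measures stored in the table
    row_count: int


def choose_source(query_dims, query_measures, aggregates, fact_table):
    """Return the smallest aggregate that covers the query, else the fact table."""
    candidates = [
        agg for agg in aggregates
        if set(query_dims) <= agg.dimensions
        and set(query_measures) <= agg.measures
    ]
    if not candidates:
        return fact_table
    return min(candidates, key=lambda agg: agg.row_count).table


def rewrite(query_dims, query_measures, aggregates, fact_table="sales_log"):
    """Build a physical SQL statement against the chosen table.

    Assumes at least one dimension and only additive (SUM) measures.
    """
    source = choose_source(query_dims, query_measures, aggregates, fact_table)
    dims = ", ".join(query_dims)
    measures = ", ".join(f"SUM({m}) AS {m}" for m in query_measures)
    return f"SELECT {dims}, {measures} FROM {source} GROUP BY {dims}"
```

A query grouped by a dimension that no aggregate covers falls through to the fact table, which mirrors the idea that aggregates accelerate queries without being required to answer them.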

AtScale Autonomous Data Engineering


Gathering live data from multiple sources across the organization can be a long, manual process.
Data engineers should be creating new value for the business rather than simply preparing and
moving data for business reporting.

Business users want self-service data access and they want it yesterday. AtScale’s Autonomous Data
Engineering™ technology identifies query patterns and creates and manages intelligent aggregates,
just like the data engineering team would do. The AI-driven optimizer learns from user behavior
and data relationships and takes care of data updates and changes, so business users can focus on
gathering insights from data and data engineers can focus on other projects.

With AtScale, the moment a model is published, data access is “live”. AtScale builds aggregates in
real-time in response to user activity and automatically tunes queries without additional manual
intervention.
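As a rough illustration of building aggregates in response to user activity, the sketch below counts how often each combination of dimensions is queried and proposes an aggregate once a threshold is reached. The `DemandTracker` class, its threshold, and the `agg_` naming scheme are assumptions made for this example; AtScale's actual optimizer is not described at this level of detail in this document.

```python
from collections import Counter


class DemandTracker:
    """Toy demand-based heuristic (hypothetical, for illustration only)."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # queries needed before proposing an aggregate
        self.counts = Counter()     # how often each dimension set was queried
        self.built = set()          # dimension sets that already have an aggregate

    def record(self, dimensions):
        """Record one query; return a proposed aggregate name or None."""
        key = frozenset(dimensions)
        self.counts[key] += 1
        if self.counts[key] >= self.threshold and key not in self.built:
            self.built.add(key)
            return "agg_" + "_".join(sorted(key))
        return None
```

Because the key is a set, queries that list the same dimensions in a different order count toward the same pattern.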


AtScale Intelligent Data Virtualization

Migrating data to the cloud can come with sticker shock. The ease of spinning up new servers (and
forgetting to shut them down) is the perfect recipe for ever-escalating costs. On the cloud data platform
side, a malformed query can cost thousands of dollars if you’re on a consumption-based pricing plan
and that same query can eat up all your available resources if you’re on a fixed pricing plan.

AtScale’s Intelligent Data Virtualization™ automates the sourcing, curation and modeling of data on
premises or in the cloud. It blends live data from multiple data sources into virtual cubes. Virtualization
makes IT more agile with the ability to store data in the most suitable platforms while providing the
flexibility to adopt new platforms in the future without re-architecting their stack or disrupting their
downstream data consumers.

AtScale’s Intelligent Data Virtualization provides access to enterprise data by functioning as an
abstraction layer on top of a variety of data platforms, without manually moving data. Features include:

▲ Collaborative Model Management - A browser-based, multi-user UI that provides a unified
modeling environment, built for sharing and re-use.

▲ Integrated Source Data Discovery - Automatic metadata discovery and synchronization with your
data platforms when working with source data.

▲ Virtual Cube Catalog - A browser-based UI that allows data consumers to search and discover
published virtual cubes and request access from the data owner.

▲ Data Virtualization - First generation data virtualization was not designed for the large
analytical workloads that are typical of today’s BI and AI use cases. AtScale’s deep expertise in
multidimensional analytics along with federated query processing provides unparalleled support
for BI and AI tools alike. While first generation data virtualization lacked automated performance
management, AtScale’s autonomous data engineering accelerates queries by orders of magnitude
without manual intervention and data engineering.

AtScale’s patented security capabilities respect native data warehousing security by supporting end-
to-end user delegation and impersonation. AtScale’s object-level security supports user and group
access rules while providing discoverability for a 360-degree feedback experience with virtual cube
designers. With integrations with enterprise data catalog and governance tools, AtScale can enforce
data governance rules using AtScale’s virtualized governance layer.


AtScale Design Center

AtScale Design Center is a browser-based interface that model developers use to create and publish
AtScale virtual cubes. Design Center is organized around the following object model:

▲ Organizations: The ‘top-level’ structure contains sets of users, groups, roles, permissions, and
more. AtScale instances will have at least one Organization. Additional Organizations can be
created and they are mutually exclusive. Nothing is shared between Organizations.

▲ Projects: A Project is a collection of one or more virtual OLAP cubes. Each project contains a shared
Library with objects like Datasets, Dimensions and Calculations that can be shared with other cubes
in that project. There is no limit to how many Projects you can create or how many objects can be
shared.

▲ Cubes: A Cube is created within a Project and can use shared objects from that Project’s Library. A
Cube is a collection of Datasets, Dimensions, Measures, Hierarchies, and Calculations along with
their Relationships that form the basis of a virtual, multidimensional view of your source data.
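The containment rules in this object model can be sketched with a few data classes. This is only an illustration of the relationships described above (an Organization holds Projects, a Project holds a shared Library and Cubes, and Cubes reuse Library objects); the class and method names are hypothetical and do not reflect AtScale's internal representation.

```python
from dataclasses import dataclass, field


@dataclass
class Cube:
    name: str
    dimensions: list = field(default_factory=list)
    measures: list = field(default_factory=list)


@dataclass
class Project:
    name: str
    library: dict = field(default_factory=dict)  # shared objects, keyed by name
    cubes: dict = field(default_factory=dict)

    def add_cube(self, name, shared_dimensions=()):
        """Create a cube that references dimensions from this project's library."""
        dims = [self.library[d] for d in shared_dimensions]  # KeyError if not shared
        cube = Cube(name, dimensions=dims)
        self.cubes[name] = cube
        return cube


@dataclass
class Organization:
    name: str
    projects: dict = field(default_factory=dict)  # nothing is shared across organizations

    def add_project(self, name):
        project = Project(name)
        self.projects[name] = project
        return project
```

Two cubes in the same project that use the same library dimension hold a reference to one shared object, so a change to that dimension is visible to both.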

Models in AtScale visually look like a ‘star’ or ‘snowflake’ schema. There is no dependency on any
particular physical data structure or layout: data can be normalized, denormalized, or a mix of both.

This image shows a customer dimension built from two base tables, specifying a hierarchy with
Country/Region at the top, followed by State/Province, City, and Customers. Customers have
two additional dimensions attached at the Customer ID level. There is no limit to logical
model creation regardless of what the physical model is on disk.


Along with defining dimensions and measures, you can define table relationships for many-to-
one and many-to-many table joins.

There are a number of measures coming from the ‘sales_log’ table, such as Order Quantity and
Sales Amount. Items that are in green text are virtual calculated columns and don’t need to
physically exist in the data. The series of orange lines connecting dimensions to the fact table
denote where join relationships have been defined in the model.


Once a model is published, the connection instructions are shown for the various connection
types like JDBC, ODBC and XMLA.

Connection details can be automatically downloaded for tools such as Tableau (as a .TDS file).
Once published, the AtScale cube can be accessed via the BI/AI tool of choice or from a custom
application.


This view of a Tableau report shows the dimensions and measures on the left that have been
defined in an AtScale model and automatically appear in Tableau via an AtScale generated TDS.


A view of the same AtScale cube in Excel. The same dimensions and measures appear here as
they did in Tableau but through an XMLA interface instead.

AtScale’s Universal Semantic Layer provides the same logical semantic layer regardless of the BI and/
or AI/ML tool. Users can interact with data using the same dimensions, hierarchies, and measures
defined in Design Center. With AtScale, data is delivered as a service to all data consumers, with
no restrictions on sharing and collaboration, and queries are accelerated by AtScale’s Acceleration
Structures.

AtScale Data Platform Software Development Kit


AtScale’s Data Platform Software Development Kit (SDK) makes it easy to extend support to new data
sources. The SDK enables developers to add support for new data platforms and include
platform-specific optimizations that enhance performance and add capabilities. The SDK is
plug-and-play, making it easy for third-party developers to add support without requiring changes to the core
AtScale platform.


DEPLOYMENT
AtScale micro-services install on a Linux server or virtual machine within any environment, either
on-premises or in the cloud. The AtScale instance serves both as a query endpoint for BI/AI tools and
as a modeling endpoint for AtScale Design Center, a browser-based design environment for creating and
managing virtual cubes.

After creating a model in AtScale Design Center and publishing it, data is available for querying from
any BI/AI tool or custom application via SQL (ODBC/JDBC) or MDX (OLE DB/ XMLA). AtScale has a
zero client-side footprint: there’s no need to install AtScale-specific drivers or applications on client
machines to query the AtScale service. With the ODBC/JDBC protocol, AtScale can accept queries
using any Hive/Thrift driver. With the MDX/XMLA protocol, AtScale adheres to the Microsoft SQL Server
Analysis Services (SSAS) XMLA standard so tools like Excel will work “as is” using existing drivers.
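To make the MDX/XMLA path more concrete, the sketch below builds a standard XMLA `Execute` request as a SOAP envelope, which is the kind of message an XMLA client such as Excel sends. The element names and the `urn:schemas-microsoft-com:xml-analysis` namespace come from the XMLA specification; the function name, the cube name in the usage note, and the catalog name are hypothetical, and the code only constructs the message without sending it to any endpoint.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_xmla_execute(mdx, catalog):
    """Return an XMLA Execute request (SOAP envelope) as a string."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # The MDX statement travels inside Command/Statement.
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx

    # Properties select the catalog and the result format.
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Format").text = "Multidimensional"

    return ET.tostring(envelope, encoding="unicode")
```

For example, `build_xmla_execute("SELECT [Measures].[Sales Amount] ON COLUMNS FROM [Sales]", "SalesProject")` produces an envelope that could be posted over HTTP to the XMLA connection URL shown in Design Center after publishing; the cube and catalog names here are placeholders.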


The AtScale platform consists of multiple services:
1. Agent - Installed on a virtualization fabric node to communicate with the virtualization service.

2. Balancer - Internal load balancers for routing traffic for High Availability (HA).

3. Coordinator - Installed on at least 3 nodes for managing High Availability (HA).

4. Database - AtScale internal database for storing application and configuration data.

5. Directory - Internal LDAP directory used if an external directory is not defined.

6. Engine - AtScale query engine service.

7. Health - Health check service.

8. Ingress - Bridge service for virtualization configuration services.

9. Modeler - AtScale Design Center service.

10. Servicecontrol - AtScale services manager.

11. Virtualization_listener - Virtualization fabric listener service.

12. Virtualization_supervisor - Virtualization fabric supervisor service.

13. Virtualization_worker - Virtualization fabric worker.

AtScale requires the following software to operate:


1. Active Directory, LDAP or cloud based Identity platform

2. External Load Balancer (for HA configurations)


INTEGRATIONS
AtScale support for data platforms is more than just SQL dialect translation. AtScale understands
the semantics of the queries as well as the native platform’s underlying capabilities for security,
performance and agility. As a result, AtScale can holistically evaluate many different strategies for query
optimization and autonomous data engineering.
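For context, plain SQL dialect translation (the baseline this paragraph says AtScale goes beyond) looks roughly like the sketch below. It covers only two well-known differences, identifier quoting and row limiting, for four of the platforms listed in this section. The `DIALECTS` table and `render_select` function are hypothetical and are not part of AtScale; the sketch also ignores details such as the case sensitivity of quoted identifiers.

```python
# Quoting and row-limit conventions for a few platforms (illustration only).
DIALECTS = {
    "sqlserver": {"quote": ("[", "]"), "limit_style": "top"},
    "bigquery": {"quote": ("`", "`"), "limit_style": "limit"},
    "snowflake": {"quote": ('"', '"'), "limit_style": "limit"},
    "postgres": {"quote": ('"', '"'), "limit_style": "limit"},
}


def render_select(dialect, table, columns, limit):
    """Render a simple row-limited SELECT in the given dialect."""
    spec = DIALECTS[dialect]
    open_q, close_q = spec["quote"]
    cols = ", ".join(f"{open_q}{c}{close_q}" for c in columns)
    tbl = f"{open_q}{table}{close_q}"
    if spec["limit_style"] == "top":
        # SQL Server places the row limit right after SELECT.
        return f"SELECT TOP {limit} {cols} FROM {tbl}"
    return f"SELECT {cols} FROM {tbl} LIMIT {limit}"
```

Syntax differences like these are mechanical; choosing where to build aggregates or how to use a platform's native security requires knowledge of the platform beyond its SQL grammar.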

AtScale supports the following data platforms:


1. Data Lake SQL Engines

a. Hive

b. Impala

c. Spark

2. Traditional Data Warehouses

a. Teradata

b. Oracle

c. Microsoft SQL Server

d. Postgres

3. Cloud Data Warehouses

a. Snowflake

b. Google BigQuery

c. Amazon Redshift


FREQUENTLY ASKED QUESTIONS
What do I need to deploy AtScale?
▲ You need to configure AtScale to point to a supported data platform as listed in the Integrations
section of this document. While not required, you will also want to configure AtScale to access
your directory service (AD/LDAP) and your external load balancer for High Availability (HA)
configurations. For AtScale installation, at least one Linux server or virtual machine is required with
some basic prerequisites to install the AtScale software. For client tool access, you may need the
appropriate JDBC/ODBC drivers if they aren’t already installed. No additional driver is necessary for
Excel or tools that use the XMLA (MDX) protocol.

Is there a trial and/or open-source version of AtScale?


▲ AtScale supports a proof-of-concept trial. Please contact us to discuss your use case and/or project
needs to determine if a proof-of-concept trial would be appropriate.

How does AtScale interact with my data platform?


▲ AtScale acts as a client to your data platform(s) and will generate optimized, platform-specific SQL
based on the AtScale model defined in the AtScale Design Center.

▲ Once a cube is published, it is immediately available for BI and/or AI/ML activity. There is no pre-
processing or data movement required when publishing a model. Data consumers can connect to
the AtScale engine via ODBC/JDBC or MDX and begin querying the cube.

▲ Acting as a server to the BI/AI tools and custom applications, AtScale receives incoming queries
from end users. AtScale rewrites queries for execution on a data platform and leverages any
available aggregates in the acceleration structures that would be beneficial to the user’s query.

▲ Simultaneously, AtScale’s machine learning algorithms are monitoring user activity and managing
its acceleration structures to automatically optimize query performance. The acceleration
structures are aggregate tables that are created and stored in a schema on your data platform(s).

What are the options for aggregate creation in the acceleration structures?
▲ Acceleration structures may be triggered in 3 different ways.

▲ ‘Demand-based Aggregates’ are generated heuristically based on user query behavior.


▲ ‘Predictive Aggregates’ are generated proactively based on model design. For example, dimensional
aggregates may be generated to facilitate fast lookups for building reports.

▲ ‘User-Defined Aggregates’ are defined by the AtScale Design Center user and are stored with the
virtual cube model. Users can specify combinations of dimensions and measures to design an
aggregate manually.

▲ In addition to these types, settings are available for adjusting behavior and thresholds for creating
demand-based and predictive aggregates.

How are the acceleration structures managed and kept current?


▲ There are three methods of controlling how and when the acceleration structures are refreshed.

▲ Acceleration structures may be refreshed on a time or calendar basis using AtScale’s built-in
scheduler.

▲ Acceleration structures may be refreshed on a file trigger basis by using AtScale’s file watcher
utility. This method is often used in conjunction with an ETL pipeline to trigger a refresh upon
completion of an ETL flow.

▲ Acceleration structures may be refreshed using AtScale’s REST API. As with the file trigger option,
this method is often used in conjunction with an ETL pipeline workflow.

NOTE: Acceleration structures can be updated either incrementally or in full refresh mode. Incremental
updates allow for the appending of new or changed data whereas a full refresh will rebuild the aggregates
from scratch.
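The difference between the two refresh modes can be shown with a small sketch that treats an aggregate as a dictionary of summed values per key. The function names are hypothetical, and the incremental version handles only appended rows with additive measures; handling changed rows would need more bookkeeping than is shown here.

```python
def full_refresh(fact_rows):
    """Rebuild the aggregate from scratch by scanning every fact row."""
    totals = {}
    for key, amount in fact_rows:
        totals[key] = totals.get(key, 0) + amount
    return totals


def incremental_refresh(aggregate, new_rows):
    """Fold only newly arrived rows into an existing aggregate."""
    updated = dict(aggregate)  # leave the caller's aggregate untouched
    for key, amount in new_rows:
        updated[key] = updated.get(key, 0) + amount
    return updated
```

For appended data the two modes give the same result, but the incremental path reads only the new rows, which is why it is usually cheaper on large fact tables.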

ABOUT ATSCALE
The Global 2000 relies on AtScale – the intelligent data virtualization company – to provide a single, secured and governed
workspace for distributed data. The combination of the company’s Cloud OLAP, Autonomous Data Engineering™ and Universal
Semantic Layer™ powers business intelligence resulting in faster, more accurate business decisions at scale. For more
information, visit www.atscale.com.
