AtScale Technical Overview
2020
TABLE OF CONTENTS
Data Virtualization
Deployment
Integrations
▲ Seamlessly migrate to the cloud. Enterprises avoid business disruption by porting analytical
workloads without rewriting them.
▲ Simplify the analytics infrastructure. Enterprises can use the best tool and platform for the job
without moving data or adding new data stores.
▲ Modernize and future-proof the analytics stack. Enterprises are taking advantage of data lakes
and cloud data warehouses while preparing for future platforms.
▲ Secure and govern data in one place. With a live connection to all data in a virtual cube,
enterprises can stop worrying about data traveling the world on users’ laptops.
▲ Turbocharge analytics and machine learning initiatives. Enterprises instantly integrate new
data sources because AtScale delivers a single, super-fast data-as-a-service API for all data.
▲ See all data in a single, unified view, no matter where it is stored or how it is formatted.
▲ Conduct interactive and multidimensional analyses using business users’ preferred BI tools,
whether that is Excel, Power BI, Tableau, or something else.
▲ Get consistent answers across departments and business units via AtScale’s Universal Semantic
Layer™ that standardizes queries regardless of BI tool or query language.
ATSCALE CLOUD OLAP IN TWO STEPS
▲ Works directly on any data platform - Cloud Data Lake, Snowflake, Google BigQuery, AWS Redshift,
Teradata Cloud, Oracle Cloud, and more - without the need to extract data or build a physical cube
▲ Scales with your data platform rather than requiring you to provision a separate environment for
hosting your cubes
▲ Handles any amount of data because it doesn’t require that you pre-compute every possible
combination of dimensions and measures
▲ Handles data in different data stores and locations via Intelligent Data Virtualization
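The third bullet is worth unpacking: a classic MOLAP engine pre-computes aggregates for group-by combinations of dimensions, and the number of combinations grows as 2^d. A quick illustration (the dimension names are hypothetical):

```python
from itertools import combinations

# Hypothetical dimensions for a sales cube.
dims = ["country", "state", "city", "product", "year", "month"]

# Every group-by subset a classic MOLAP engine might pre-compute.
subsets = [c for r in range(len(dims) + 1) for c in combinations(dims, r)]

print(len(subsets))  # 2^6 = 64 candidate aggregates for just 6 dimensions
```

Each added dimension doubles the candidate space, which is why AtScale materializes only the aggregates that query activity actually justifies.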
AtScale’s Universal Semantic Layer unifies semantic definitions and metrics for data and makes it
available in one location for BI, AI and ML applications. It works on data anywhere whether it’s on-
premises or in the cloud. With AtScale’s semantic layer, you can move data “as is” to a different data
platform or environment without disrupting your end users.
The AtScale Universal Semantic Layer models data as virtual cubes using dimensions, hierarchies,
measures, attributes and calculations. It creates a business-centric view of the data using an OLAP
modeling paradigm. Using familiar modeling constructs, users can quickly source and model any data,
creating virtual views regardless of where the data lives or how it is stored.
Once a model is created, it can be published and accessed via ODBC, JDBC, MDX or REST. The AtScale
Universal Semantic Layer allows any BI tool to connect to AtScale and access models without the need
for additional data engineering or custom client-installed drivers. It virtualizes both the data platforms
and the data analytics tools through centralized management and governance.
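Because a published model is reachable over standard protocols, connecting is mostly a matter of pointing an existing driver at the AtScale engine. A sketch of what the endpoints might look like; the host, ports, and cube name below are placeholders, not documented AtScale defaults:

```python
# Hypothetical endpoints for a published AtScale model.
host = "atscale.example.com"

# SQL access rides the Hive/Thrift wire protocol, so any Hive JDBC driver works.
jdbc_url = f"jdbc:hive2://{host}:10502/sales_insights"

# MDX access follows the SSAS XMLA convention, so Excel connects "as is".
xmla_url = f"http://{host}:10502/xmla/default"

# REST access for programmatic queries.
rest_url = f"https://{host}:10500/query"

print(jdbc_url)
```

The key point is that none of these require an AtScale-specific client library on the consumer's machine.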
Business users want self-service data access and they want it yesterday. AtScale’s Autonomous Data
Engineering™ technology identifies query patterns and creates and manages intelligent aggregates,
just like the data engineering team would do. The AI-driven optimizer learns from user behavior
and data relationships and takes care of data updates and changes, so business users can focus on
gathering insights from data and data engineers can focus on other projects.
With AtScale, the moment a model is published, data access is “live”. AtScale builds aggregates in
real-time in response to user activity and automatically tunes queries without additional manual
intervention.
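The feedback loop described above can be sketched simply: watch incoming group-by patterns, and once a pattern repeats past a threshold, materialize an aggregate for it. A minimal illustration (the threshold and the in-memory structures are invented stand-ins for AtScale's internals):

```python
from collections import Counter

query_log = Counter()   # group-by pattern -> times observed
aggregates = set()      # patterns we have materialized
THRESHOLD = 3           # invented threshold, purely for illustration

def observe(groupby_dims):
    """Record a query's group-by pattern; materialize an aggregate once it's hot."""
    pattern = frozenset(groupby_dims)
    query_log[pattern] += 1
    if query_log[pattern] >= THRESHOLD and pattern not in aggregates:
        # In AtScale this step becomes an aggregate table on the data platform.
        aggregates.add(pattern)

for _ in range(3):
    observe(["country", "year"])

print(sorted(next(iter(aggregates))))  # ['country', 'year']
```

In the real system the decision also weighs data relationships and cost, but the shape of the loop — observe, learn, materialize — is the same.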
Migrating data to the cloud can come with sticker shock. The ease of spinning up new servers (and
forgetting to shut them down) is the perfect recipe for ever-escalating costs. On the cloud data platform
side, a malformed query can cost thousands of dollars if you’re on a consumption-based pricing plan
and that same query can eat up all your available resources if you’re on a fixed pricing plan.
AtScale’s Intelligent Data Virtualization™ automates the sourcing, curation and modeling of data on
premises or in the cloud. It blends live data from multiple data sources into virtual cubes. Virtualization
makes IT more agile with the ability to store data in the most suitable platforms while providing the
flexibility to adopt new platforms in the future without re-architecting their stack or disrupting their
downstream data consumers.
▲ Integrated Source Data Discovery - Automatic metadata discovery and synchronization with your
data platforms when working with source data.
▲ Virtual Cube Catalog - A browser-based UI that allows data consumers to search and discover
published virtual cubes and request access from the data owner.
▲ Data Virtualization - First generation data virtualization was not designed for the large
analytical workloads that are typical of today’s BI and AI use cases. AtScale’s deep expertise in
multidimensional analytics along with federated query processing provides unparalleled support
for BI and AI tools alike. While first generation data virtualization lacked automated performance
management, AtScale’s autonomous data engineering accelerates queries by orders of magnitude
without manual intervention and data engineering.
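Conceptually, data virtualization presents rows from multiple physical sources as one logical dataset. A toy sketch with two in-memory "sources" standing in for, say, a warehouse and a data lake (all names and rows are invented):

```python
# Two hypothetical physical sources behind one virtual view.
warehouse_orders = [{"order_id": 1, "region": "EMEA", "amount": 120.0}]
lake_orders      = [{"order_id": 2, "region": "APAC", "amount": 75.5}]

def virtual_orders():
    """A 'virtual view': consumers iterate one dataset, unaware of the split."""
    yield from warehouse_orders
    yield from lake_orders

total = sum(row["amount"] for row in virtual_orders())
print(total)  # 195.5
```

A real federated engine additionally pushes filters and joins down to each source; the sketch only shows the unified-view idea.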
AtScale’s patented security capabilities respect native data warehousing security by supporting end-
to-end user delegation and impersonation. AtScale’s object-level security supports user and group
access rules while providing discoverability for a 360-degree feedback experience with virtual cube
designers. With integrations with enterprise data catalog and governance tools, AtScale can enforce
data governance rules using AtScale’s virtualized governance layer.
AtScale Design Center is a browser-based interface that model developers use to create and publish
AtScale virtual cubes. Design Center is organized around the following object model:
▲ Organizations: The ‘top-level’ structure contains sets of users, groups, roles, permissions, and
more. AtScale instances will have at least one Organization. Additional Organizations can be
created and they are mutually exclusive. Nothing is shared between Organizations.
▲ Projects: A Project is a collection of one or more virtual OLAP cubes. Each project contains a shared
Library with objects like Datasets, Dimensions and Calculations that can be shared with other cubes
in that project. There is no limit to how many Projects you can create or how many objects can be
shared.
▲ Cubes: A Cube is created within a Project and can use shared objects from that Project’s Library. A
Cube is a collection of Datasets, Dimensions, Measures, Hierarchies, and Calculations along with
their Relationships that form the basis of a virtual, multidimensional view of your source data.
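The Organization → Project → Cube containment described above can be sketched with simple dataclasses (the field names are illustrative, not AtScale's API):

```python
from dataclasses import dataclass, field

@dataclass
class Cube:
    name: str
    datasets: list = field(default_factory=list)
    measures: list = field(default_factory=list)

@dataclass
class Project:
    name: str
    library: dict = field(default_factory=dict)   # shared Datasets/Dims/Calcs
    cubes: list = field(default_factory=list)     # any number of cubes

@dataclass
class Organization:
    name: str
    projects: list = field(default_factory=list)  # nothing shared across orgs

org = Organization("acme")
proj = Project("sales_analytics")
proj.cubes.append(Cube("internet_sales"))
org.projects.append(proj)
print(org.projects[0].cubes[0].name)  # internet_sales
```

Note how the shared Library hangs off the Project, matching the rule that objects are shared between cubes within a project but never between Organizations.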
Models in AtScale visually look like a ‘star’ or ‘snowflake’ schema. There is no dependency for any
particular physical data structure or layout: data can be normalized or denormalized or a little bit of both.
This image shows a customer dimension built from 2 base tables specifying a hierarchy with
Country/Region at the top, followed by State/Province, City, and Customers. Customers have
two additional dimensions attached at the Customer ID level. There is no limit to logical
model creation regardless of what the physical model is on disk.
There are a number of measures coming from the ‘sales_log’ table, such as Order Quantity and
Sales Amount. Items that are in green text are virtual calculated columns and don’t need to
physically exist in the data. The series of orange lines connecting dimensions to the fact table
denote where join relationships have been defined in the model.
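The green-text calculated columns are a good example of work the model does at query time rather than in storage. A sketch of a virtual column derived on the fly (the column names echo the example above; the formula itself is invented):

```python
# Physical fact rows, as they might exist in a 'sales_log' table.
sales_log = [
    {"order_quantity": 2, "unit_price": 10.0},
    {"order_quantity": 1, "unit_price": 25.0},
]

def sales_amount(row):
    """Virtual calculated column: never stored, computed per query."""
    return row["order_quantity"] * row["unit_price"]

total_sales = sum(sales_amount(r) for r in sales_log)
print(total_sales)  # 45.0
```

Because the column is defined in the model rather than the table, changing its formula requires no ETL rework.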
Connection details can be automatically downloaded for tools such as Tableau (as a .TDS file).
Once published, the AtScale cube can be accessed via the BI/AI tool of choice or in a custom
application.
AtScale’s Universal Semantic Layer provides the same logical semantic layer regardless of the BI and/
or AI/ML tool. Users can interact with data using the same dimensions, hierarchies, and measures
defined in Design Center. With AtScale, data is delivered as a service to all data consumers, free to
share and collaborate at the speed of thought with AtScale’s Acceleration Structures.
After creating a model in AtScale Design Center and publishing it, data is available for querying from
any BI/AI tool or custom application via SQL (ODBC/JDBC) or MDX (OLE DB/XMLA). AtScale has a
zero client-side footprint: there’s no need to install AtScale-specific drivers or applications on client
machines to query the AtScale service. With the ODBC/JDBC protocol, AtScale can accept queries
using any Hive/Thrift driver. With the MDX/XMLA protocol, AtScale adheres to the Microsoft SQL Server
Analysis Services (SSAS) XMLA standard so tools like Excel will work “as is” using existing drivers.
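Because the same cube answers both protocols, a BI tool can ask the same business question in either dialect. Hypothetical queries against a cube named `Internet Sales` (cube, column, and member names are invented):

```python
# The same question -- sales by year -- in the two supported dialects.
sql = """
SELECT d_year, SUM(sales_amount) AS sales
FROM "Internet Sales"
GROUP BY d_year
"""

mdx = """
SELECT [Measures].[Sales Amount] ON COLUMNS,
       [Date].[Year].Members ON ROWS
FROM [Internet Sales]
"""

# Both arrive at AtScale, which rewrites them for the underlying data platform.
print("GROUP BY" in sql, "ON ROWS" in mdx)  # True True
```

Tableau or a custom app would typically send the SQL form over ODBC/JDBC, while Excel PivotTables send the MDX form over XMLA.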
▲ Balancer - Internal load balancers for routing traffic for High Availability (HA).
▲ Database - AtScale internal database for storing application and configuration data.
Supported data platforms include:
▲ SQL-on-Hadoop engines: Hive, Impala, Spark
▲ Data warehouses: Teradata, Oracle, Postgres
▲ Cloud data warehouses: Snowflake, Google BigQuery, Amazon Redshift
▲ Once a cube is published, it is immediately available for BI and/or AI/ML activity. There is no pre-
processing or data movement required when publishing a model. Data consumers can connect to
the AtScale engine via ODBC/JDBC or MDX and begin querying the cube.
▲ AtScale receives these incoming queries from end users acting as a server to the BI/AI tools and
custom applications. AtScale rewrites queries for execution on a data platform and leverages any
available aggregates in the acceleration structures that would be beneficial to the user’s query.
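The routing decision described above reduces to a coverage test: an aggregate can answer a query when the query's group-by dimensions are a subset of the aggregate's dimensions. A minimal matcher (aggregate and table names are invented):

```python
# Available aggregates: the dimensions they group by -> their physical table.
aggregates = {
    frozenset({"country", "year"}): "agg_country_year",
    frozenset({"product"}): "agg_product",
}

def route(query_dims):
    """Pick the smallest covering aggregate, else fall back to the fact table."""
    candidates = [
        (len(dims), table)
        for dims, table in aggregates.items()
        if set(query_dims) <= dims          # aggregate covers the query
    ]
    return min(candidates)[1] if candidates else "sales_log"

print(route(["year"]))          # agg_country_year
print(route(["year", "city"]))  # sales_log (no aggregate covers 'city')
```

The real optimizer also considers measures, additivity, and freshness, but subset coverage is the heart of the rewrite.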
▲ Simultaneously, AtScale’s machine learning algorithms are monitoring user activity and managing
its acceleration structures to automatically optimize query performance. The acceleration
structures are aggregate tables that are created and stored in a schema on your data platform(s).
What are the options for aggregate creation in the acceleration structures?
▲ Acceleration structures may be triggered in three different ways:
▲ ‘User-Defined Aggregates’ are defined by the AtScale Design Center user and are stored with the
virtual cube model. Users can specify combinations of dimensions and measures to design an
aggregate manually.
▲ In addition to these types, settings are available for adjusting behavior and thresholds for creating aggregates.
▲ Acceleration structures may be refreshed on a time or calendar basis using AtScale’s built-in
scheduler.
▲ Acceleration structures may be refreshed on a file trigger basis by using AtScale’s file watcher
utility. This method is often used in conjunction with an ETL pipeline to trigger a refresh upon
completion of an ETL flow.
▲ Acceleration structures may be refreshed using AtScale’s REST API. As with the file trigger option,
this method is often used in conjunction with an ETL pipeline workflow.
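A REST-triggered refresh from the tail end of an ETL job might look like the following sketch. The endpoint path, port, and token are hypothetical placeholders, not AtScale's documented API; consult the product's REST reference for the real shape:

```python
import json
import urllib.request

# Hypothetical endpoint and credentials -- placeholders only.
url = "https://atscale.example.com:10500/aggregates/refresh"
payload = {"cube": "internet_sales", "mode": "incremental"}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Authorization": "Bearer <token>", "Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would fire the refresh once the ETL flow completes.
print(req.get_method(), req.full_url)
```

The same call dropped into an orchestration tool's final task keeps aggregates in step with each load.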
NOTE: Acceleration structures can be updated either incrementally or in full refresh mode. Incremental
updates allow for the appending of new or changed data whereas a full refresh will rebuild the aggregates
from scratch.
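The two refresh modes can be sketched over a toy aggregate keyed by day (the rows are invented):

```python
fact_rows = [("2020-01-01", 10), ("2020-01-02", 5)]

def full_refresh(rows):
    """Rebuild the aggregate from scratch."""
    agg = {}
    for day, qty in rows:
        agg[day] = agg.get(day, 0) + qty
    return agg

def incremental_refresh(agg, new_rows):
    """Append only new or changed data onto the existing aggregate."""
    for day, qty in new_rows:
        agg[day] = agg.get(day, 0) + qty
    return agg

agg = full_refresh(fact_rows)
agg = incremental_refresh(agg, [("2020-01-02", 3)])  # late-arriving rows
print(agg["2020-01-02"])  # 8
```

Incremental mode wins when new data is append-heavy; a full rebuild is the safer choice after upstream corrections that touch history.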
ABOUT ATSCALE
The Global 2000 relies on AtScale – the intelligent data virtualization company – to provide a single, secured and governed
workspace for distributed data. The combination of the company’s Cloud OLAP, Autonomous Data Engineering™ and Universal
Semantic Layer™ powers business intelligence resulting in faster, more accurate business decisions at scale. For more
information, visit www.atscale.com.